ABSTRACT

When computation is outsourced, the data owner would like to be 
assured that the desired computation has been performed correctly 
by the service provider. In theory, proof systems can give the nec- 
essary assurance, but prior work is not sufficiently scalable or prac- 
tical. In this paper, we develop new proof protocols for verify- 
ing computations which are streaming in nature: the verifier (data 
owner) needs only logarithmic space and a single pass over the in- 
put, and after observing the input follows a simple protocol with 
a prover (service provider) that takes logarithmic communication 
spread over a logarithmic number of rounds. These ensure that the 
computation is performed correctly: that the service provider has 
not made any errors or missed out some data. The guarantee is 
very strong: even if the service provider deliberately tries to cheat, 
there is only vanishingly small probability of doing so undetected, 
while a correct computation is always accepted. 

We first observe that some theoretical results can be modified 
to work with streaming verifiers, showing that there are efficient 
protocols for problems in the complexity classes NP and NC. Our 
main results then seek to bridge the gap between theory and prac- 
tice by developing usable protocols for a variety of problems of 
central importance in streaming and database processing. All these 
problems require linear space in the traditional streaming model, 
and therefore our protocols demonstrate that adding a prover can 
exponentially reduce the effort needed by the verifier. Our experi- 
mental results show that our protocols are practical and scalable. 

1. INTRODUCTION 

Efficient verification of computations has long played a central 
role in computer science. For example, the class of problems NP
can be defined as the set of languages with certificates of
membership that can be verified in polynomial time [2]. The most
general verification model is the interactive proof system, where
there is a resource-limited verifier V and a more powerful prover P
[2, Chapter 8]. To solve a problem, the verifier initiates a
conversation with the prover, who solves the problem and proves the
validity of his answer, following an established (randomized)
protocol.

This model can be applied to the setting of outsourcing compu- 
tations to a service provider. A wide variety of scenarios fit this 
template: in one extreme, a large business outsources its data to 
another company to store and process; at the other end of the scale, 
a hardware co-processor performs some computations within an 
embedded system. Over large data, the possibility for error in- 
creases: events like disk failure and memory read errors, which 
are usually thought unlikely, actually become quite common. A 
service provider who is paid for computation also has an economic 
incentive to take shortcuts, by returning an approximate result or 
only processing a sample of the data rather than the full amount. 
Hence, in these situations the data owner (the verifier in our model) 
wants to be assured that the computations performed by the ser- 
vice provider (the prover) are correct and complete, without having 
to take the effort to perform the computation himself. A natural 
approach is to use a proof protocol to prove the correctness of the 
answer. However, existing protocols for reliable delegation in com- 
plexity theory have so far been of theoretical interest: to our knowl- 
edge there have been no efforts to implement and use them. In part, 
this is because they require a lot of time and space for both parties. 
Historically, protocols have required the verifier to retain the full 
input, whereas in many practical situations the verifier cannot af- 
ford to do this, and instead outsources the storage of the data, often 
incrementally as updates are seen. 

In this paper we introduce a proof system over data streams. That 
is, the verifier sees a data stream and tries to solve a (potentially 
difficult) problem with the help of a more powerful prover who 
sees the same stream. At the end of the stream, they conduct a 
conversation following an established protocol, through which an 
honest prover will always convince the verifier to accept its results, 
whereas any dishonest prover will not be able to fool the verifier 
into accepting a wrong answer with more than tiny probability. 

Our work is motivated by developing applications in data out- 
sourcing and trustworthy computing in general. In the increasingly 
popular model of "cloud computing", individuals or small busi- 
nesses delegate the storage and processing of their data to a more 
powerful third party, the "cloud". This results in cost-savings, since 
the data owner no longer has to maintain expensive data storage and 
processing infrastructure. However, it is important that the data 
owner is fully assured that their data is processed accurately and 
completely by the cloud. In this paper, we provide protocols which 
allow the cloud to demonstrate that the results of queries are correct 
while keeping the data owner's computational effort minimal. 

Our protocols only need the data owner (taking the role of veri- 
fier V) to make a single streaming pass over the original data. This 
fits the cloud setting well: the pass over the input can take place
incrementally as the verifier uploads data to the cloud. So the
verifier never needs to hold the entirety of the data, since it can be
shipped up to the cloud to store as it is collected. Without these 
new protocols, the verifier would either need to store the data in 
full, or retrieve the whole data from the cloud for each query: either 
way negates the benefits of the cloud model. Instead, our methods 
require the verifier to track only a logarithmic amount of informa- 
tion and follow a simple protocol with logarithmic communication 
to verify each query. Moreover, our results are of interest even 
when the verifier is able to store the entire input: they offer very 
lightweight and powerful techniques to verify computation, which 
happen to work in a streaming setting. 

Motivating Example. For concreteness, consider the motivating
example of a cloud computing service which implements a key-value
store. That is, the data owner sends (key, value) pairs to the cloud 
to be stored, intermingled with queries to retrieve the value asso- 
ciated with a particular key. For example, Dynamo supports two 
basic operations: get and put on (key, value) pairs [9]. In this
scenario, the data owner never actually stores all the data at the same
time (this is delegated to the cloud), but does see each piece as it is 
uploaded, one at a time: so we can think of this as giving a stream 
of (key, value) pairs. Our protocols allow the cloud to demonstrate 
that it has correctly retrieved the value of a key, as well as more 
complex operations, such as finding the next/previous key, finding 
the keys with large associated values, and computing aggregates 
over the key-value pairs (see Section 1.1 for definitions). □ 

Initial study in this area has identified the two critical parameters 
as the space used (by the verifier) and the total amount of commu- 
nication between the two parties [6]. There are lower bounds which 
show that for many problems, the product of these two quantities 
must be at least linear in the size of the input when the verifier is 
not allowed to reply to the prover [6]. We use the notation of [6] 
and define an (s, t)-protocol to be one where the space usage of V
is O(s) and the total communication cost of the conversation
between P and V is O(t). We will measure both s and t in terms of
words, where each word can represent quantities polynomial in u, 
the size of the universe over which the stream is defined. We addi- 
tionally seek to minimize other quantities, such as the time costs of 
the prover and verifier, and the number of rounds of interaction. 

Note that if t = 0 the model degenerates to the standard streaming
model. We show that it is possible to drastically increase the com- 
puting power of the standard streaming model by allowing commu- 
nication with a third party, verifiably solving many problems that 
are known to be hard in the standard streaming model. 

We begin by observing that a key concept in proof systems, the
low-degree extension of the input, can be evaluated in a streaming
fashion. Via prior results, this implies that (1) all problems
in the complexity class NP have computationally sound protocols, 
so a dishonest prover cannot fool the verifier under standard crypto- 
graphic assumptions; and (2) all problems in NC have statistically 
sound protocols, meaning that the security guarantee holds even 
against computationally unbounded adversaries. These protocols 
have space and communication that is polynomial in the logarithm 
of the size of the input domain, u. These results can be contrasted 
with most results in the streaming literature, which normally apply 
only to one or a few problems at a time [21]. They demonstrate in 
principle the power of the streaming interactive proof model, but 
do not yield practical verification protocols. 

Our main contributions in this paper are to provide protocols 
that are easy to implement and highly practical, for the following 
problems: self join size, inner product, frequency moments, range 
query, range-sum query, dictionary, predecessor, and index. These 
problems are all of considerable importance and all have been
studied extensively in the standard streaming model and shown to
require linear space [21]. As a result, approximations have to be
allowed if sub-linear space is desired (for the first 3 problems); 
some of the problems do not have even approximate streaming al- 
gorithms (the last 5 problems). On the other hand, we solve them 
all exactly in our model. Our results are also asymptotically more 
efficient than those which would follow from the above theoretical 
results for NC problems. Formal definitions are in Section 1.1. 

As well as requiring minimal space and communication for the 
verification, these new protocols are also very efficient in terms 
of both parties' running time. In particular, when processing the 
stream, the verifier spends O(log u) time per element. During
verification the verifier spends O(log u) time while the (honest) prover
runs in near-linear time. Thus, while our protocols are secure against 
a prover with unlimited power, an honest prover can execute our 
protocols efficiently. This makes our protocols simple enough to 
be deployed in real computation-outsourcing situations. 

Prior Work. Cloud computing applications have also motivated 
a lengthy line of prior work in the cryptography community on
"proofs of retrievability", which allow one to verify that data is stored
correctly by the cloud (see [16] and the references therein). In this 
paper, we provide "proofs of queries" which allow the cloud to 
demonstrate that the results of queries are correct while keeping 
the data owner's computational effort minimal. 

Query verification/authentication for data outsourcing has been 
a popular topic recently in the database community. The majority 
of the work still requires the data owner to keep a full copy of the 
original data, e.g., [27]. More recently, there have been a few works 
which adopt a streaming-like model for the verifier, although they 
still require linear memory resources. For example, maintaining a 
Merkle tree [20] (a binary tree where each internal node is a cryp- 
tographic hash of its children) takes space linear in the size of the 
tree. Li et al. [19] considered verifying queries on a data stream 
with sliding windows via Merkle trees, hence the verifier's space 
is proportional to the window size. The protocol of Papadopoulos 
et al. [22] verifies a continuous query over streaming data, again 
requiring linear space on the verifier's side in the worst case. 

Although interactive proof systems and other notions of verifica- 
tion have been extensively studied, they are mainly used to establish 
complexity results and hardness of approximation. Because they 
are usually concerned with answering "hard" problems, the (hon- 
est) prover's time cost is usually super-polynomial. Hence they 
have had little practical impact [26]. Recently, [14] reduced the 
cost of the prover to polynomial. Although of striking generality, 
the protocols that result are still complex, and require (polyloga- 
rithmically) many words of space and rounds of interaction. In 
contrast, our protocols for the problems defined in Section 1.1 re- 
quire only logarithmic space and communication (and nearly linear 
running time for both prover and verifier). Thus we claim that they 
are practical for use in verifying outsourced query processing. 

Our work is most directly motivated by prior work [28, 6] on ver- 
ification of streaming computations that had stronger constraints. 
In the first model [28], the prover may send only the answer to 
the computation, which must be verified by V using a small sketch 
computed from the input stream of size n. Protocols were defined 
to verify identity and near-identity, and so because of the size of
the answer, had small space (s = 1) but large communication
(t = n). Subsequent work showed that problems of showing a
matching and connectedness in a graph could be solved in the same 
bounds, in a model where the prover's message was restricted to be 
a permutation of the input alone [23]. 

[6] introduced the notion of a streaming verifier, who must read
first the input and then the proof under space constraints. They
allowed the prover to send a single message to the verifier, with
no communication in the reverse direction. However, this does
not dramatically improve the computational power. In this model,
INDEX (see the definition in Section 1.1) can be solved using a
($\sqrt{n}$, $\sqrt{n}$)-protocol and there is also a matching lower
bound of $s \cdot t = \Omega(n)$ [6]; note that both (n, 1)- and
(1, n)-protocols are easy, so the contribution of [6] is achieving a
tradeoff between s and t. In this paper, we show that allowing more
interaction between the prover and the verifier exponentially
reduces $s \cdot t$ for this and other problems that are hard in the
standard streaming model.

1.1 Definitions and Problems 

We first formally define a valid protocol: 

DEFINITION 1. Consider a prover P and verifier V who both
observe a stream A and wish to compute a function $\beta(A)$. We
assume V has access to a private random string R, and one-way
access to the input A. After the stream is observed, V and P
exchange a sequence of messages. Denote the output of V on input
A, given prover P and random string R, by out(V, A, R, P). We
allow V to output $\perp$ if V is not convinced that P's claim is valid.

Call P a valid prover with respect to V if for all streams A,
$\Pr_R[\mathrm{out}(V, A, R, P) = \beta(A)] = 1$. Call V a valid
verifier for $\beta$ if

1. There exists at least one valid prover P with respect to V.

2. For all provers P' and all streams A,

$\Pr_R[\mathrm{out}(V, A, R, P') \notin \{\beta(A), \perp\}] \le 1/3$.

Property 2 of Definition 1 defines statistical soundness. Notice
the constant 1/3 is arbitrary, and is chosen for consistency with
standard definitions in complexity theory [2]. This should not be
viewed as a limitation: note that as soon as we have such a verifier,
we can reduce probability of error to p, by repeating the protocol
O(log 1/p) times in parallel, and rejecting if any rejects. In fact, our
protocols let this probability be set arbitrarily small by appropriate
choice of a parameter (the size of the finite field used), without
needing to repeat the protocol.
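
The amplification step can be written out explicitly. The following is a sketch of the standard calculation, assuming the k copies use independent random strings and that V accepts only if every copy accepts; the symbol k is introduced here purely for illustration.

```latex
\Pr[\text{all } k \text{ copies accept a wrong answer}]
  \;\le\; \left(\tfrac{1}{3}\right)^{k} \;\le\; p
\quad\text{whenever}\quad
k \;\ge\; \log_3 \tfrac{1}{p} \;=\; O\!\left(\log \tfrac{1}{p}\right).
```

For instance, k = 21 copies would already push the error below $10^{-10}$, since $3^{21} \approx 1.05 \times 10^{10}$.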

DEFINITION 2. We say the function $\beta$ possesses an r-round
(s, t)-protocol if there exists a valid verifier V for $\beta$ such that:

1. V has access to only O(s) words of working memory.

2. There is a valid prover P for V such that P and V exchange
at most 2r messages (r messages in each direction), and the sum of
the lengths of all messages is O(t) words.

We define some canonical problems to represent common queries
on outsourced data, such as in a key-value store. Denote the
universe from which data elements are drawn as $[u] = \{0, \ldots, u-1\}$.

INDEX: Given a stream of u bits $b_1, \ldots, b_u$, followed by an index
q, the answer is $b_q$.

DICTIONARY: The input is a stream of $n \le u$ (key, value) pairs,
where both the key and the value are drawn from the universe [u],
and all keys are distinct. The stream is followed by a query $q \in [u]$.
If q is one of the keys, then the answer is the corresponding value;
otherwise the answer is "not found". This exactly captures the case
of key-value stores such as Dynamo [9].

PREDECESSOR: Given a stream of n elements in [u], followed by
a query $q \in [u]$, the answer is the largest p in the stream such that
$p \le q$. We assume that 0 always appears in the stream. SUCCESSOR
is defined symmetrically. In a key-value store, this corresponds to
finding the previous (next) key present relative to a query key.

RANGE QUERY: Given a stream of n elements in [u], followed by
a range query $[q_L, q_R]$, the answer is the set of all elements in the
stream between $q_L$ and $q_R$ inclusive.

RANGE-SUM: The input is a stream of n (key, value) pairs, where
both the key and the value are drawn from the universe [u], and all
keys are distinct. The stream is followed by a range query $[q_L, q_R]$.
The answer is the sum of all the values with keys between $q_L$ and
$q_R$ inclusive.

SELF-JOIN SIZE: Given a stream of n elements from [u], compute
$\sum_{i \in [u]} a_i^2$, where $a_i$ is the number of occurrences of i in the
stream. This is also known as the second frequency moment.

FREQUENCY MOMENTS: In general, for any integer $k \ge 1$,
$\sum_{i \in [u]} a_i^k$ is called the k-th frequency moment of the vector a,
written $F_k(a)$.

INNER PRODUCT (or JOIN SIZE): Given two streams A and B with
frequency vectors $(a_1, \ldots, a_u)$ and $(b_1, \ldots, b_u)$, compute
$\sum_{i \in [u]} a_i b_i$.
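
To make the problem definitions concrete, the following is a minimal sketch of what each query computes when the whole stream is available in memory. These are plain reference implementations that use linear space, not the verification protocols of this paper; the function names are hypothetical, indices are taken as 0-based, and returning distinct elements for RANGE QUERY is an assumption about how "set" should be read.

```python
from collections import Counter

def index(bits, q):
    """INDEX: the q-th bit of the stream (0-based here)."""
    return bits[q]

def dictionary(pairs, q):
    """DICTIONARY: value stored under key q, or "not found"."""
    return dict(pairs).get(q, "not found")

def predecessor(stream, q):
    """PREDECESSOR: largest element p of the stream with p <= q."""
    return max(x for x in stream if x <= q)

def range_query(stream, q_lo, q_hi):
    """RANGE QUERY: distinct stream elements in [q_lo, q_hi], sorted."""
    return sorted({x for x in stream if q_lo <= x <= q_hi})

def range_sum(pairs, q_lo, q_hi):
    """RANGE-SUM: sum of values whose keys lie in [q_lo, q_hi]."""
    return sum(v for k, v in pairs if q_lo <= k <= q_hi)

def frequency_moment(stream, k):
    """F_k: sum over items of (number of occurrences) ** k."""
    return sum(c ** k for c in Counter(stream).values())

def inner_product(stream_a, stream_b):
    """INNER PRODUCT: sum over items of a_i * b_i."""
    a, b = Counter(stream_a), Counter(stream_b)
    return sum(a[i] * b[i] for i in a)
```

SELF-JOIN SIZE is `frequency_moment(stream, 2)`. Each function needs the full input, which is exactly the cost the streaming verifier is trying to avoid.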

These queries are broken into two groups. The first four are
reporting queries, which ask for elements from the input to be
returned. INDEX is a classical problem that in the streaming model
requires $\Omega(u)$ space [18]. It is clear that PREDECESSOR,
DICTIONARY, RANGE QUERY, RANGE-SUM are all more general than
INDEX and hence, also require linear space. These problems would be
easy if the query were fixed before the data is seen. But in most ap- 
plications, the user (the verifier) forms queries in response to other 
information that is only known after the data has arrived. For ex- 
ample, in database processing a typical range query may ask for 
all people in a given age range, where the range of interest is not 
known until after the database is instantiated. 

The remaining queries are aggregation queries, computations 
that combine multiple elements from the input. SELF-JOIN SIZE 
requires linear space in the streaming model [1] to solve exactly (al- 
though there are space-efficient approximation algorithms). Since 
FREQUENCY MOMENTS and INNER PRODUCT are more general
than SELF-JOIN SIZE, they also require linear space. In Section 6, 
we consider more general functions, such as heavy hitters, distinct 
elements ($F_0$), and frequency-based functions. These functions also
require linear space to solve exactly, and certain functions like $F_{\max}$
require polynomial space even to approximate [21]. These are
additional functionalities that an advanced key-value store might
support. For example, $F_0$ returns the number of distinct keys which
are currently active, and the heavy hitters are the keys which have 
the largest values associated with them. These functions are also 
important in other contexts, e.g., tracking the heavy hitters over 
network data corresponds to the heaviest users or destinations [21]. 

Outline. In Section 2, we describe how existing proof systems 
can be modified to work with streaming verifiers, thereby provid- 
ing space- and communication-efficient streaming protocols for all 
of NP and NC respectively. Subsequently, we improve upon these 
protocols for many problems of central importance in streaming 
and database processing. In Section 3 we give more efficient proto- 
cols to solve the aggregation queries (exactly), and in Section 4 we 
provide protocols for the reporting queries. In both cases, our
protocols require only O(log u) space for the verifier V, and O(log u)
words of communication spread over log u rounds. An experimental
study in Section 5 shows that these protocols are practical. In
Section 6 we extend this approach to a class of frequency-based
functions, providing protocols requiring O(log u) space and log u
rounds, at the cost of more communication.

2. PROOFS AND STREAMS 

We make use of a central concept from complexity theory, the 
low-degree extension (LDE) of the input, which is used in our pro- 
tocols in the final step of checking. We explain how an LDE com- 
putation can be made over a stream of updates, and describe the 
immediate consequences for prior work which used the LDE.

Input Model. Each of the problems described in Section 1.1 above
operates over an input stream. More generally, in all cases we can
treat the input as defining an implicit vector a, such that the value
associated with key i is the i-th entry, $a_i$. The vector a has length u,
which is typically too large to store (e.g. $u = 2^{128}$ if we consider
the space of all possible IPv6 addresses). At the start of the stream,
the vector $a = (a_0, \ldots, a_{u-1})$ is initialized to 0. Each element in
the stream is a pair of values, $(i, \delta)$ for integer $\delta$. A pair
$(i, \delta)$ in the stream updates $a_i \leftarrow a_i + \delta$. This is a
very general scenario: we can interpret pairs as adding to a value
associated with each key (we allow negative values of $\delta$ to capture
decrements or deletions). Or, if each i occurs at most once in the
stream, we can treat $(i, \delta)$ as associating the value $\delta$ with
the key i.
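
As an illustration of the input model, the sketch below materializes the implicit vector from a stream of updates. It assumes a party with enough memory to do so (for example the prover); the name `apply_updates` is hypothetical, and only nonzero entries are stored because u itself may be far too large.

```python
from collections import defaultdict

def apply_updates(stream):
    """Build the implicit vector a from (i, delta) updates.

    Returns a dict mapping each key i to its nonzero entry a_i.
    """
    a = defaultdict(int)
    for i, delta in stream:
        a[i] += delta          # a_i <- a_i + delta
        if a[i] == 0:
            del a[i]           # deletions cancelled earlier insertions
    return dict(a)
```

For the stream `[(5, 2), (3, 1), (5, -2), (7, 4)]` the two updates to key 5 cancel, leaving entries only for keys 3 and 7.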

Low-Degree Extensions. Given an input stream which defines a
vector a, we define a function $f_a$ which is used in our protocols to
check the prover's claims. Conceptually, we think of the vector a
in terms of a function $f_a$, so that $f_a(i) = a_i$. By interpolation,
$f_a(i)$ can be represented as a polynomial, which is called the
low-degree extension (LDE) of a [2]. LDEs are a standard tool in
complexity theory. They give V surprising power to detect deviations
by P from the prescribed protocol.

Given the LDE polynomial $f_a$, we can also evaluate it at a
location r > u. In our protocols, the verifier picks a secret location r
and computes $f_a(r)$. In what follows, we formalize this notion, and
explain how it is possible to compute $f_a(r)$ in small space,
incrementally as stream updates are seen.

First, we conceptually rearrange the data from a one-dimensional
vector to a d-dimensional array. We let integer $\ell$ be a parameter,
and assume for simplicity that $u = \ell^d$ is a power of $\ell$. Let
$a = (a_1, \ldots, a_u)$ be a vector in $[u]^u$. We first interpret a as a
function $f'_a : [\ell]^d \to [u]$ as follows: letting $(i)_k$ denote the
k-th least significant digit of i in base-$\ell$ representation, we
associate each $i \in [u]$ with a vector
$((i)_1, (i)_2, \ldots, (i)_d) \in [\ell]^d$, and define $f'_a(i) = a_i$.

Pick a prime p such that u < p. The low-degree extension (LDE)
of a is a d-variate polynomial $f_a$ over the field $\mathbb{Z}_p$ so that
$f_a(x) = f'_a(x)$ for all $x \in [\ell]^d$. Notice since $f_a$ is a polynomial
over the field $\mathbb{Z}_p$, $f_a(x)$ is defined for all $x \in [p]^d$; $f_a$
essentially extends the domain of $f'_a$ from $[\ell]^d$ to $[p]^d$. Let
$x = (x_1, \ldots, x_d) \in [p]^d$ be a point in this d-dimensional space.
The polynomial $f_a : [p]^d \to \mathbb{Z}_p$ can be defined in terms of an
indicator function $\chi_v$ which is 1 at location
$v = (v_1, \ldots, v_d) \in [\ell]^d$ and zero elsewhere in $[\ell]^d$ via

$\chi_v(x) = \prod_{j=1}^{d} \chi_{v_j}(x_j)$, (1)

where $\chi_k(x_j)$ is the Lagrange basis polynomial given by

$\chi_k(x_j) = \frac{(x_j - 0) \cdots (x_j - (k-1))(x_j - (k+1)) \cdots (x_j - (\ell-1))}{(k - 0) \cdots (k - (k-1))(k - (k+1)) \cdots (k - (\ell-1))}$, (2)

which has the property that $\chi_k(x_j) = 1$ if $x_j = k$ and 0 for all
$x_j \ne k$, $x_j \in [\ell]$. We then define
$f_a(x) = \sum_{v \in [\ell]^d} a_v \chi_v(x)$, which meets the
requirement that $f_a(x) = f'_a(x)$ when $x \in [\ell]^d$.
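
As a worked instance of the Lagrange basis construction, consider the smallest case $\ell = 2$ and $d = 2$, so that u = 4. This is only an illustration; entries of a are indexed here by their digit vectors $(v_1, v_2)$, least significant digit first.

```latex
\chi_0(x_j) = \frac{x_j - 1}{0 - 1} = 1 - x_j, \qquad
\chi_1(x_j) = \frac{x_j - 0}{1 - 0} = x_j,
\\[4pt]
f_a(x_1, x_2) = a_{(0,0)}\,(1 - x_1)(1 - x_2) + a_{(1,0)}\,x_1 (1 - x_2)
              + a_{(0,1)}\,(1 - x_1)\,x_2 + a_{(1,1)}\,x_1 x_2 .
```

Setting $(x_1, x_2)$ to any point of $\{0,1\}^2$ leaves exactly one term, equal to the corresponding entry of a, while at other points of $[p]^2$ the polynomial mixes all four entries. With $\ell = 2$ the LDE is multilinear.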

Streaming Computation of LDE. We observe that while the
polynomial $f_a$ is defined over the very large domain $[p]^d$, it is
actually very efficient to evaluate $f_a(r)$ for some $r \in [p]^d$ even
when the input a is defined incrementally by a stream as in our input
model. This follows from substituting r into (1): we obtain

$f_a(r) = \sum_{v \in [\ell]^d} a_v \chi_v(r)$. (3)

Now observe that for fixed r this is a linear function of a: a sum
of multiples of the entries $a_v$. So to compute $f_a(r)$ in a streaming
fashion, we can initialize $f_a(r) = 0$, and process each update
$(i, \delta)$:

$f_a(r) \leftarrow f_a(r) + \delta \chi_{v(i)}(r)$, (4)

where v(i) denotes the (canonical) remapping of i into $[\ell]^d$. Note
that $\chi_v(r)$ can be computed in (at most) $O(d\ell)$ field operations,
via (2); and V only needs to keep $f_a(r)$ and r, which takes d + 1 words
in [p]. Hence, we conclude

THEOREM 1. The LDE $f_a(r)$ can be computed over a stream
of updates using space O(d) and time per update $O(\ell d)$.
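
The streaming update rule (4) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the class and function names are hypothetical, p is assumed to be a prime larger than u, and the modular inverse uses Python's three-argument `pow` with exponent -1 (Python 3.8 or later).

```python
def lagrange_basis(k, x, ell, p):
    """chi_k(x) over Z_p: equals 1 at x = k and 0 at the other points of [ell]."""
    num, den = 1, 1
    for j in range(ell):
        if j != k:
            num = num * (x - j) % p
            den = den * (k - j) % p
    return num * pow(den, -1, p) % p

def digits(i, ell, d):
    """The remapping v(i): base-ell digits of i, least significant first."""
    out = []
    for _ in range(d):
        out.append(i % ell)
        i //= ell
    return out

def chi_v(v, r, ell, p):
    """chi_v(r): product of one-dimensional basis polynomials, O(d * ell) work."""
    result = 1
    for v_j, r_j in zip(v, r):
        result = result * lagrange_basis(v_j, r_j, ell, p) % p
    return result

class StreamingLDE:
    """Keeps only r (d words) and the running value f_a(r) (one word)."""

    def __init__(self, r, ell, p):
        self.r, self.ell, self.p = list(r), ell, p
        self.value = 0

    def update(self, i, delta):
        """Process one stream element (i, delta) as in update rule (4)."""
        v = digits(i, self.ell, len(self.r))
        term = delta * chi_v(v, self.r, self.ell, self.p)
        self.value = (self.value + term) % self.p
```

In a real run the verifier would draw r uniformly at random from $[p]^d$ before the stream starts and keep it secret. Choosing r equal to the digit vector of a key i makes `value` equal to $a_i$ mod p, which is a convenient sanity check.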

Initial Results. We now describe results which follow by com- 
bining the streaming computation of LDE with prior results. De- 
tailed analysis is in Appendix A. The constructions of [14] (respec- 
tively, [17]) yield small-space non-streaming verifiers and polylog- 
arithmic communication for all problems in log-space uniform NC 
(respectively, NP), and achieve statistical (respectively, computa- 
tional) soundness. The following theorems imply that both con- 
structions can be implemented with a streaming verifier. 

THEOREM 2. There are computationally sound
(polylog u, log u)-protocols for any problem in NP.

Although Theorem 2 provides protocols with small space and 
communication, this does not yield a practical proof system. Even 
ignoring the complexity of constructing a PCP, the prover in a
Universal Argument may need to solve an NP-hard problem just to
determine the correct answer. However, Theorem 2 does demon- 
strate that in principle it is possible to have extremely efficient veri- 
fication systems with streaming verifiers even for problems that are 
computationally difficult in a non-streaming setting. 

THEOREM 3 (Extending Theorem 3 in [14]). There are
statistically sound (polylog u, polylog u)-protocols for any problem
in log-space uniform NC.

Here, NC is the class of all problems decidable by circuits of poly- 
nomial size and polylogarithmic depth; equivalently, the class of 
problems decidable in polylogarithmic time on a parallel computer 
with a polynomial number of processors. This class includes, for 
example, many fundamental matrix problems (e.g. determinant, 
product, inverse), and graph problems (e.g. minimum spanning 
tree, shortest paths) (see [2, Chapter 6]). Despite its powerful gen- 
erality, the protocol implied by Theorem 3 is not optimal for many 
important functions in streaming and database applications. The 
remainder of this paper obtains improved, practical protocols for 
the fundamental problems listed in Section 1.1. 

3. INTERACTIVE PROOFS FOR 
AGGREGATION QUERIES 

We describe a protocol for the aggregation queries with a quadratic 
improvement over that obtained from Theorem 3. 

3.1 SELF-JOIN SIZE Queries

We first explain the case of SELF-JOIN SIZE, which is
$F_2 = \sum_{i \in [u]} a_i^2$. In the SELF-JOIN SIZE problem we are
promised $\delta = 1$ for all updates $(i, \delta)$, but our protocol works
even if we allow any integer $\delta$, positive or negative. This
generality is useful for other queries considered later.

As in Section 2, let integer $\ell \ge 2$ be a parameter to be
determined. We assume that u is a power of $\ell$ for ease of presentation.
Pick prime p so u < p < 2u (by Bertrand's Postulate, such a p
always exists). We also assume that p is chosen so that $F_2 = O(p)$,
to keep the analysis simple. The protocol we propose is similar
to sum-check protocols in interactive proofs (see [2, Chapter 8]);
given any d-variate polynomial g over $\mathbb{Z}_p$, a sum-check protocol
allows a polynomial-time verifier V to compute $\sum_{z \in H^d} g(z)$ for
any $H \subseteq \mathbb{Z}_p$, as long as V can evaluate g at a
randomly-chosen location in polynomial time. A sum-check protocol
requires d rounds of interaction, and the length of the i-th message
from P to V is equal to $\deg_i(g)$, the degree of g in the i-th variable.

Let $a^2$ denote the entry-wise square of a. A natural first attempt
at a protocol for $F_2$ is to apply a sum-check protocol to the LDE
$f_{a^2}$ of $a^2$, i.e. $f_{a^2} = \sum_{v \in [\ell]^d} a_v^2 \chi_v$. However, a
streaming verifier cannot evaluate $f_{a^2}$ at a random location because
$a^2$ is not a linear transform of the input. The key observation we
need is that a streaming verifier can work with a different polynomial
of slightly higher degree that also agrees with $a^2$ on $[\ell]^d$.
Specifically, the polynomial
$f_a^2 = (\sum_{v \in [\ell]^d} a_v \chi_v)^2$. That is, V can evaluate the
polynomial $f_a^2$ at a random location r: V computes $f_a(r)$ as in
Section 2, and uses the identity $f_a^2(r) = f_a(r)^2$. We can then apply
a sum-check protocol to $f_a^2$ in our model; details follow.

The protocol. Before observing the stream, the verifier picks a
random location $r = (r_1, \ldots, r_d) \in [p]^d$. Both prover and verifier
observe the stream which defines a. The verifier V evaluates the
LDE $f_a(r)$ in incremental fashion, as described in Section 2.

After observing the stream, the verification protocol proceeds in
d rounds as follows. In the first round, the prover sends a
polynomial $g_1(x_1)$, and claims that

$g_1(x_1) = \sum_{x_2, \ldots, x_d \in [\ell]^{d-1}} f_a^2(x_1, x_2, \ldots, x_d)$. (5)

Observe that if $g_1$ is as claimed, then
$F_2(a) = \sum_{x_1 \in [\ell]} g_1(x_1)$. Since the polynomial $g_1(x_1)$ has
degree $2(\ell - 1)$, it can be described in $2(\ell - 1) + 1$ words.

Then, in round j > 1, the verifier sends $r_{j-1}$ to the prover. In
return, the prover sends a polynomial $g_j(x_j)$, and claims that

$g_j(x_j) = \sum_{x_{j+1}, \ldots, x_d \in [\ell]^{d-j}} f_a^2(r_1, \ldots, r_{j-1}, x_j, x_{j+1}, \ldots, x_d)$. (6)

The verifier compares the two most recent polynomials by checking

$g_{j-1}(r_{j-1}) = \sum_{x_j \in [\ell]} g_j(x_j)$

and rejecting otherwise. The verifier also rejects if the degree of $g_j$
is too high: each $g_j$ should have degree at most $2(\ell - 1)$.

In the final round, the prover has sent $g_d$ which is claimed to be

$g_d(x_d) = f_a^2(r_1, \ldots, r_{d-1}, x_d)$.

The verifier can now check that $g_d(r_d) = f_a^2(r)$ (recall that the
verifier tracked $f_a(r)$ incrementally in the stream). If this test
succeeds (and so do all previous tests), then the verifier accepts, and is
convinced that $F_2(a) = \sum_{x_1 \in [\ell]} g_1(x_1)$. We defer the detailed
proof of correctness and the analysis of the prover's cost to
Appendix B.1.
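
The rounds described above can be simulated end to end for small inputs. The sketch below is an illustration under stated assumptions, not the authors' code: it fixes $\ell = 2$, so each $g_j$ has degree at most 2 and is sent as its values at 0, 1, 2 (which makes the degree check implicit); the prover is a naive one that recomputes sums directly rather than the near-linear-time prover analysed in the appendix; and all function names are hypothetical.

```python
import random
from itertools import product

def chi(i, x, p):
    """Multilinear indicator chi_{v(i)}(x) for ell = 2, over Z_p."""
    out = 1
    for j, x_j in enumerate(x):
        out = out * (x_j if (i >> j) & 1 else 1 - x_j) % p
    return out

def lde(a, x, p):
    """f_a(x) = sum_i a_i * chi_{v(i)}(x), computed directly from a."""
    return sum(a_i * chi(i, x, p) for i, a_i in enumerate(a) if a_i) % p

def honest_prover(a, prefix, d, p):
    """Return g_j as its values at 0, 1, 2, where j = len(prefix) + 1."""
    evals = []
    for x_j in (0, 1, 2):
        total = 0
        for suffix in product((0, 1), repeat=d - len(prefix) - 1):
            point = list(prefix) + [x_j] + list(suffix)
            total += lde(a, point, p) ** 2
        evals.append(total % p)
    return evals

def eval_deg2(evals, x, p):
    """Evaluate the degree-2 polynomial through (0,y0), (1,y1), (2,y2) at x."""
    y0, y1, y2 = evals
    inv2 = pow(2, -1, p)
    l0 = (x - 1) * (x - 2) * inv2
    l1 = -x * (x - 2)
    l2 = x * (x - 1) * inv2
    return (y0 * l0 + y1 * l1 + y2 * l2) % p

def verify_f2(stream, d, p, prover=honest_prover, rng=random):
    """Run the protocol; return the verified F2 mod p, or None on rejection."""
    r = [rng.randrange(p) for _ in range(d)]   # verifier's secret point
    a = [0] * (1 << d)                         # prover's copy of the data
    f_r = 0                                    # verifier's running f_a(r)
    for i, delta in stream:
        a[i] += delta
        f_r = (f_r + delta * chi(i, r, p)) % p
    g = prover(a, [], d, p)                    # round 1
    claimed = (g[0] + g[1]) % p
    for j in range(1, d):                      # rounds 2..d
        expected = eval_deg2(g, r[j - 1], p)
        g = prover(a, r[:j], d, p)             # prover has now seen r_1..r_j
        if (g[0] + g[1]) % p != expected:
            return None
    if eval_deg2(g, r[d - 1], p) != f_r * f_r % p:
        return None
    return claimed
```

Apart from the simulated prover state `a`, the verifier's side of `verify_f2` touches only `r`, `f_r`, and the most recent message, which mirrors the small-space claim. A prover that alters any message is caught unless the random point happens to hide the discrepancy.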

Analysis of space and communication. The communication cost
of the protocol is dominated by the polynomials being sent by the
prover. Each polynomial can be sent in $O(\ell)$ words, so over the d
rounds, the total cost is $O(d\ell)$ communication. The space required
by the verifier is bounded by having to remember r, $f_a(r)$ and a
constant number of polynomials (the verifier can "forget"
intermediate polynomials once they have been checked). The total cost
of this is $O(d + \ell)$ words. Probably the most economical tradeoff is
reached by picking $\ell = 2$ and $d = \log u$, yielding both
communication and space cost for V of $O(\log u)$ words. Combining these
settings with Lemma 1 and the analysis in Appendix B.1, we have:

Footnote: It is possible to trade off smaller space for more
communication by, say, setting $\ell = \log^{\epsilon} u$ and
$d = \frac{1}{\epsilon} \cdot \frac{\log u}{\log\log u}$ for any small
constant $\epsilon > 0$, which yields a protocol with
$O(\frac{\log u}{\log\log u})$ space and $O(\log^{1+\epsilon} u)$
communication.



THEOREM 4. There is a $(\log u, \log u)$-protocol for SELF-JOIN SIZE with probability of failure $O(\frac{\log u}{p})$. The prover's total time is $O(\min(u, n \log(u/n)))$; the verifier takes time $O(\log u)$ per update.

Remarks. Lemma 1 in Appendix B.1 shows that the failure probability is $2\ell d/p = 4\log u/p$. It can be made as low as $O(\frac{\log u}{u^c})$ for any constant $c$, by choosing $p$ larger than $u^c$, without changing the asymptotic bounds. Notice that the smallest-depth circuit computing $F_2$ has depth $\Theta(\log u)$, as any function that depends on all bits of the input requires at least logarithmic depth. Therefore, Theorem 3 yields a $(\log^2 u, \log^2 u)$-protocol for $F_2$, and our protocol represents a quadratic improvement in both parameters.

3.2 Other Problems 

Our protocol for F2 can be easily modified to support the other 
aggregation queries listed in Section 1.1. 

Higher frequency moments. The protocol outlined above naturally extends to higher frequency moments, or the sum of any polynomial function of $a_i$. For example, we can simply replace $f_a^2$ with $f_a^k$ in (5) and (6) to compute the $k$-th frequency moment $F_k$ (again, assuming $u$ is chosen large enough so $F_k < u$). The communication cost increases to $O(k \log u)$, since each $g_j$ now has degree $O(k)$ and so requires correspondingly more words to describe. However, the verifier's space bound remains at $O(\log u)$ words.

Inner product. Given two streams defining two vectors $a$ and $b$, their inner product is defined by $a \cdot b = \sum_{i \in [u]} a_i b_i$. Observe that $F_2(a + b) = F_2(a) + F_2(b) + 2 a \cdot b$. Hence, the inner product can be verified by verifying three $F_2$ computations.

More directly, the above protocol for $F_2$ can be adapted to verify the inner product: instead of providing polynomials which are claimed to be sums of $f_a^2$, we now have two LDEs $f_a$ and $f_b$, which encode $a$ and $b$ respectively. The verifier again picks a random $r$, and evaluates the LDEs $f_a(r)$ and $f_b(r)$ over the stream. The prover now provides polynomials that are claimed to be sums of $f_a f_b$. This observation is useful for the RANGE-SUM problem.
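
The reduction from inner product to three $F_2$ computations can be checked on a small example. This is a minimal sketch; the helper names `f2` and `inner_via_f2` are hypothetical, and the computation is done over the integers rather than in $\mathbb{Z}_p$.

```python
def f2(v):
    """Self-join size: the sum of squared entries."""
    return sum(x * x for x in v)

def inner_via_f2(a, b):
    """Recover a . b from F2(a + b) = F2(a) + F2(b) + 2 a . b."""
    s = [x + y for x, y in zip(a, b)]
    return (f2(s) - f2(a) - f2(b)) // 2

a = [2, 3, 8, 1]
b = [1, 0, 2, 5]
direct = sum(x * y for x, y in zip(a, b))
```

In a protocol each of the three $F_2$ values would be verified separately, which is why the direct adaptation using $f_a f_b$ is the more economical route.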

Range-sum. It is easy to see that RANGE-SUM is a special case of INNER PRODUCT. Here, every (key, value) pair in the input stream can be considered as updating $i =$ key with $\delta =$ value to generate $a$. When the query $[q_L, q_R]$ is given, the verifier defines $b_{q_L} = \cdots = b_{q_R} = 1$ and $b_i = 0$ for all other $i$. One technical issue is that computing $f_b(r)$ directly from the definition requires $O(u \log u)$ time. However, the verifier can compute it much faster for such $b$. Again fix $\ell = 2$. Decompose the range $[q_L, q_R]$ into $O(\log u)$ canonical intervals, where each interval consists of all locations $v$ where $v_{j+1}, \ldots, v_d$ are fixed while all possible $(v_1, \ldots, v_j) \in [2]^j$ occur, for some $j$. The value of $f_b(r)$ in each such interval is

$$\begin{aligned}
f_b(r) &= \sum_{(v_1,\ldots,v_j) \in [2]^j} \chi_{(v_1,\ldots,v_d)}(r) \\
&= \sum_{(v_1,\ldots,v_j) \in [2]^j} \prod_{k=1}^{j} \chi_{v_k}(r_k) \cdot \prod_{k=j+1}^{d} \chi_{v_k}(r_k) \\
&= \prod_{k=j+1}^{d} \chi_{v_k}(r_k) \left( \sum_{(v_1,\ldots,v_j) \in [2]^j} \prod_{k=1}^{j} \chi_{v_k}(r_k) \right) \\
&= \prod_{k=j+1}^{d} \chi_{v_k}(r_k) \cdot \left( \prod_{k=1}^{j} \bigl(\chi_0(r_k) + \chi_1(r_k)\bigr) \right) = \prod_{k=j+1}^{d} \chi_{v_k}(r_k),
\end{aligned}$$

which can be computed in $O(\log u)$ time. The final evaluation is found by summing over the $O(\log u)$ canonical intervals, so the time to compute $f_b(r)$ is $O(\log^2 u)$. This is used to determine whether $g_d(r_d) = f_a(r) f_b(r)$. Hence, the verifier can continue the rest of the verification process in $O(\log u)$ rounds as before.
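
The canonical-interval evaluation of $f_b(r)$ can be compared against the direct definition. The sketch below makes several assumptions that are not from the text: positions are 0-based, $v_1$ is the least significant bit, $\chi_0(x) = 1 - x$ and $\chi_1(x) = x$, the modulus is $2^{61} - 1$, and the names `canonical_blocks`, `fb_fast` and `fb_direct` are hypothetical.

```python
P = 2**61 - 1  # assumed prime modulus

def chi1(bit, x):
    """Single-variable basis for l = 2: chi_0(x) = 1 - x, chi_1(x) = x (mod P)."""
    return x % P if bit else (1 - x) % P

def canonical_blocks(lo, hi):
    """Split [lo, hi] (inclusive) into aligned blocks (start, j) of length 2**j."""
    blocks = []
    while lo <= hi:
        j = 0
        while lo % (2 ** (j + 1)) == 0 and lo + 2 ** (j + 1) - 1 <= hi:
            j += 1
        blocks.append((lo, j))
        lo += 2 ** j
    return blocks

def fb_fast(lo, hi, r):
    """Sum over blocks of prod_{k > j} chi_{v_k}(r_k); the free low bits contribute 1."""
    total = 0
    for start, j in canonical_blocks(lo, hi):
        term = 1
        for k in range(j, len(r)):
            term = term * chi1((start >> k) & 1, r[k]) % P
        total += term
    return total % P

def fb_direct(lo, hi, r):
    """Definition: sum of chi_v(r) over every position v in the range."""
    total = 0
    for v in range(lo, hi + 1):
        term = 1
        for k in range(len(r)):
            term = term * chi1((v >> k) & 1, r[k]) % P
        total += term
    return total % P

r = [5, 11, 7, 3]  # fixed values for the example; the verifier draws these at random
```

The greedy split takes the largest aligned block at each step, which keeps the number of blocks logarithmic in the range length.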

Figure 1: Example tree T over input vector [2,3,8,1,7,6,4,3] and sub-vector query (1,5).

4. INTERACTIVE PROOFS FOR 
REPORTING QUERIES 

We first present an interactive proof protocol for a class of SUB-VECTOR queries, which is powerful enough to incorporate INDEX, DICTIONARY, PREDECESSOR, and RANGE QUERY as special cases.

4.1 Sub-vector Queries 

As before, the input is a stream of $n$ pairs $(i, \delta)$, each of which sets $a_i \leftarrow a_i + \delta$, defining a vector $a = (a_1, \ldots, a_u)$ in $[u]^u$. The correct answer to a SUB-VECTOR query specified by a range $[q_L, q_R]$ is the $k$ nonzero entries in the sub-vector $(a_{q_L}, \ldots, a_{q_R})$.

The protocol. Let $p$ be a prime such that $u < p < 2u$. The verifier V conceptually builds a tree $T$ of constant degree $\ell$ on the vector $a$. V first generates $\log u$ independent random numbers $r_1, \ldots, r_{\log u}$ uniformly from $[p]$. For simplicity, we describe the case for $\ell = 2$. For each node $v$ of the tree, we define a "hash" value as follows. For the $i$-th leaf $v$, set $v = a_i$. For an internal node $v$ at level $j$ (the leaves are at level 0), define

$$v = v_L + v_R \, r_j, \qquad (7)$$

where $v_L$ and $v_R$ are the left and right child of $v$, respectively. Additions and multiplications are done over the field $\mathbb{Z}_p$ as in Section 3. Denote the root of the tree by $t$. The verifier is only required to keep $r_1, \ldots, r_{\log u}$ and $t$. Later we show that V can compute $t$ without materializing the binary tree $T$, and that this is essentially an LDE computation.
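
One way to see that $t$ can be maintained without building $T$: unrolling definition (7), the leaf $a_i$ reaches the root multiplied by $r_j$ for every level $j$ at which its ancestor is a right child. The Python sketch below illustrates this under assumptions that are not from the text: leaves are indexed from 0, so bit $j-1$ of $i$ decides the right-child test at level $j$, the modulus is $2^{61} - 1$ rather than a prime between $u$ and $2u$, and the names `update_root` and `root_by_tree` are hypothetical.

```python
P = 2**61 - 1  # assumed prime modulus

def update_root(t, i, delta, r):
    """Fold one stream update (i, delta) into the root hash using O(log u) work."""
    w = delta % P
    for j, rj in enumerate(r):
        if (i >> j) & 1:  # the ancestor of leaf i is a right child at this level
            w = w * rj % P
    return (t + w) % P

def root_by_tree(a, r):
    """Reference: materialize the tree and apply v = v_L + v_R * r_j level by level."""
    level = [x % P for x in a]
    for rj in r:
        level = [(level[k] + level[k + 1] * rj) % P for k in range(0, len(level), 2)]
    return level[0]

a = [2, 3, 8, 1, 7, 6, 4, 3]
r = [3, 5, 7]  # fixed for the example; the verifier draws these at random
t = 0
for i, delta in [(5, 6), (0, 2), (2, 8), (1, 3), (7, 3), (3, 1), (4, 7), (6, 4)]:
    t = update_root(t, i, delta, r)
```

The streaming value agrees with the root of the materialized tree regardless of the order in which updates arrive, since each update only adds a term to a sum.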

We first present the interactive verification protocol between P and V after the input has been observed by both. The verifier only needs $r_1, \ldots, r_{\log u}$, $t$, and the query range $[q_L, q_R]$ to carry out the protocol. First V sends $q_L$ and $q_R$ to P, and P returns the claimed sub-vector, say, $a'_{q_L}, \ldots, a'_{q_R}$ (P actually only needs to return the $k$ nonzero entries). In addition, if $q_L$ is even, P also returns $a'_{q_L - 1}$; if $q_R$ is odd, P also returns $a'_{q_R + 1}$. Then V tries to verify whether $a_i = a'_i$ for all $q_L \le i \le q_R$ using the following protocol. The general idea is to reconstruct $T$ using information provided by P. If P is behaving correctly, the (hash of the) reconstructed root, say $t'$, should be the same as $t$; otherwise with high probability $t' \ne t$ and V will reject. Define $\gamma^{(j)}(i)$ to be the ancestor of the $i$-th leaf of $T$ on level $j$. The protocol proceeds in $\log u - 1$ rounds, and maintains the invariant that after the $j$-th round, V has reconstructed $\gamma^{(j+1)}(i)$ for all $q_L \le i \le q_R$. The invariant is easily established initially ($j = 0$) since P provides $a'_{q_L}, \ldots, a'_{q_R}$ and the siblings of $a'_{q_L}$ and $a'_{q_R}$ if needed. In the $j$-th round, V sends $r_j$ to P. Having $r_1, \ldots, r_j$ to hand, P can construct the $j$-th level of $T$. P then returns to V the siblings of $\gamma^{(j)}(q_L)$ and $\gamma^{(j)}(q_R)$ if they are needed by V. Then V reconstructs $\gamma^{(j+1)}(i)$ for all $q_L \le i \le q_R$. At the end of the $(\log u - 1)$-th round, V has reconstructed $\gamma^{(\log u)}(i) = t'$, and checks that $t = t'$. If so, then the initial answer provided by P is accepted, otherwise it is rejected.

Example. Figure 1 shows a small example on the vector $a = [2, 3, 8, 1, 7, 6, 4, 3]$. We fix the hash function parameters $r = [1, 1, 1]$ to keep the example simple (ordinarily these parameters are chosen randomly), and show the hash value inside each node. For the range (2, 6), in the first round the prover reports the sub-vector [3, 8, 1, 7, 6] (shown highlighted). Since the left endpoint of this range is even, P also reports $a_1 = 2$. From this, V is able to compute some hashes at the next level: 5, 9 and 13. After sending $r_1$ to P, V receives the fact that the hash of the range (7, 8) is 7. From this, V can compute the final hash values and check that they agree with the precomputed hash value of $t$, 34. □
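
The example can be replayed in code. The Python sketch below is illustrative only: it uses 0-based positions, so the range (2, 6) of the example becomes [1, 5] and the parity tests are flipped relative to the 1-based description; it assumes the number of leaves is a power of two; the modulus $2^{61} - 1$ and the names `tree_levels`, `verify_subvector` and `sibling` are assumptions. For brevity the honest prover is modelled as a lookup into the full tree rather than as a round-by-round exchange.

```python
P = 2**61 - 1  # assumed prime modulus

def tree_levels(a, r):
    """All levels of the hash tree, leaves first, using v = v_L + v_R * r_j."""
    levels = [[x % P for x in a]]
    for rj in r:
        prev = levels[-1]
        levels.append([(prev[k] + prev[k + 1] * rj) % P
                       for k in range(0, len(prev), 2)])
    return levels

def verify_subvector(claimed, lo, hi, r, t, sibling):
    """Rebuild the root from the claimed entries at positions lo..hi (0-based).

    sibling(level, index) returns the hash the prover supplies for that node.
    """
    cur = [x % P for x in claimed]
    for j, rj in enumerate(r):
        if lo % 2 == 1:  # leftmost node is a right child: request its left sibling
            lo -= 1
            cur.insert(0, sibling(j, lo))
        if hi % 2 == 0:  # rightmost node is a left child: request its right sibling
            hi += 1
            cur.append(sibling(j, hi))
        cur = [(cur[k] + cur[k + 1] * rj) % P for k in range(0, len(cur), 2)]
        lo, hi = lo // 2, hi // 2
    return cur[0] == t

a = [2, 3, 8, 1, 7, 6, 4, 3]
r = [1, 1, 1]
levels = tree_levels(a, r)
t = levels[-1][0]
honest = lambda level, index: levels[level][index]
ok = verify_subvector(a[1:6], 1, 5, r, t, honest)
bad = verify_subvector([3, 8, 1, 7, 5], 1, 5, r, t, honest)
```

With $r = [1, 1, 1]$ every hash is a plain sum, so only alterations that change the sum are caught; with randomly chosen $r$ any alteration is caught with high probability.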
We prove the next theorem in Appendix B.2. 

THEOREM 5. There is a $(\log u, \log u + k)$-protocol for SUB-VECTOR, with failure probability $O(\frac{\log u}{p})$. The prover's total time is $O(\min(u, n \log(u/n)))$; the verifier takes time $O(\log u)$ per update.

4.2 Answering Reporting Queries 

We now show how to answer the reporting queries using the so- 
lution to SUB-VECTOR. 

• It is straightforward to solve RANGE QUERY using SUB-VECTOR: each element $i$ in the stream is interpreted as a vector update with $\delta = 1$, and vector entries with non-zero counts intersecting the range give the required answer.

• INDEX can be interpreted as a special case of RANGE QUERY with $q_L = q_R = q$.

• For DICTIONARY, we must distinguish between "not found" and 
a value of 0. We do this by using a universe size of [u + 1] for the 
values: each value is incremented on insertion. At query time, if 
the retrieved value is 0, the result is "not found"; otherwise the 
value is decremented by 1 and returned. 

• For PREDECESSOR, we interpret each key in the stream as an update with $\delta = 1$. In the protocol V first asks for the index of the predecessor of $q$, say $q'$, and then verifies that the sub-vector $(a_{q'}, \ldots, a_q) = (1, 0, \ldots, 0)$, with communication cost $O(\log u)$ (since $k = 1$).
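
The DICTIONARY encoding and the PREDECESSOR check in the list above are simple enough to state as code. This is a minimal sketch with hypothetical helper names (`dict_insert`, `dict_decode`, `predecessor_ok`); it operates on entries that are assumed to have already been verified by the SUB-VECTOR protocol, and the predecessor check assumes each key appears once in the stream.

```python
def dict_insert(a, key, value):
    """Store value + 1 so that a stored 0 can be told apart from an absent key."""
    a[key] += value + 1

def dict_decode(entry):
    """Map a verified vector entry back to a value, or None for "not found"."""
    return None if entry == 0 else entry - 1

def predecessor_ok(sub_vector):
    """Check that the verified entries (a_q', ..., a_q) have the form (1, 0, ..., 0)."""
    return (len(sub_vector) > 0
            and sub_vector[0] == 1
            and all(x == 0 for x in sub_vector[1:]))

a = [0] * 8
dict_insert(a, 3, 0)
dict_insert(a, 5, 41)
```

Here key 3 holds the value 0 and key 5 holds 41, while key 4 decodes to "not found".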

COROLLARY 1. There is a $(\log u, \log u)$-protocol for DICTIONARY, INDEX and PREDECESSOR, where the prover takes time $O(\min(u, n \log(u/n)))$. There is a $(\log u, \log u + k)$-protocol for RANGE QUERY, where the prover's time is $O(k + \min(u, n \log(u/n)))$. For all protocols, the verifier takes time $O(\log u)$.

5. EXPERIMENTAL STUDY 

We performed a brief experimental study to validate our claims that the protocols described are practical. We compared protocols for both the reporting queries and aggregate queries. Specifically, we compared the multi-round protocols for $F_2$ described in Section 3 to the single round protocol given in [6], which can be seen as a protocol in our setting with $d = 2$ and $\ell = \sqrt{u}$. For reporting queries, we show the behavior of our SUB-VECTOR protocol, and we present experimental results when the length $q_R - q_L$ of the sub-vector queried is 1000. Together, these determine the performance of the 8 core queries: the three aggregate queries are based on the $F_2$ protocol, while the five reporting queries are based on the SUB-VECTOR protocol.

Our implementation was made in C++: it performed the computations of both parties, and measured the resources consumed by the protocols. All programs were compiled with g++ using the -O3 optimization flag. For the data, we generated synthetic streams with $u = n$ where the number of occurrences of each item $i$ was picked uniformly in the range [0, 1000]. Note that the choice of data does not affect the behavior of the protocols: their guarantees do not depend on the data, but rather on the random choices of the verifier. The computations were made over the field of size $p = 2^{61} - 1$, giving a probability of $4 \cdot 61/p \approx 10^{-16}$ of the verifier being fooled by a dishonest prover. These computations were executed using native 64-bit arithmetic, so increasing this probability is unlikely to affect performance. This probability could be reduced further to, e.g., $4 \cdot 127/(2^{127} - 1) < 10^{-35}$, at the cost of using 128-bit arithmetic.
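
The two probabilities quoted above follow from the bound $4 \log u / p$ in the Remarks after Theorem 4. As a quick check with exact rational arithmetic (a sketch; the function name `failure_bound` is hypothetical, and $\log u$ is taken to be 61 and 127 respectively, as in the expressions above):

```python
from fractions import Fraction

def failure_bound(log_u, p):
    """Bound 4 * log(u) / p on the probability that a dishonest prover is accepted."""
    return Fraction(4 * log_u, p)

b61 = failure_bound(61, 2**61 - 1)     # native 64-bit arithmetic
b127 = failure_bound(127, 2**127 - 1)  # 128-bit arithmetic
```

Both moduli are Mersenne primes, which is one plausible reason for the choice, since reduction modulo $2^k - 1$ needs only shifts and additions.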

We evaluated the protocols on a single core of a multi-core machine with 64-bit AMD Opteron processors and 32 GB of memory available. The large memory let us experiment with universes of size several billion, with the prover able to store the entire frequency vector in memory. We measured all relevant costs: the time for V to compute the check information from the stream, for P to generate the proof, and for V to verify this proof. We also measured the space required by V, and the size of the proof provided by P.

Experimental Results. When the prover was honest, both proto- 
cols always accepted. We also tried modifying the prover's mes- 
sages, by changing some pieces of the proof, or computing the 
proof for a slightly modified stream. In all cases, the protocols 
caught the error, and rejected the proof. We conclude that the pro- 
tocols work as analyzed, and the focus of our experimental study is 
to understand how they scale to large volumes of data. 

Figure 2 shows the behavior of the F2 protocols as the data size 
varies. First, Figure 2(a) shows the time for V to process the stream 
to compute the necessary LDEs as the stream length increases. 
Both show a linear trend (here, plotted on log scale). Moreover, 
both take comparable time (within a constant factor), with the multi- 
round verifier processing about 21 million updates per second, and 
the single round V processing 35 million. The similarity is not sur- 
prising: both methods are taking each element of the stream and 
computing the product of the frequency with a function of the ele- 
ment's index i and the random parameter r. The effort in computing 
this function is roughly similar in both cases. The single round V 
has a slight advantage, since it can compute and use lookup tables 
within the $O(\sqrt{u})$ space bound [6], while the multi-round verifier
limited to logarithmic space must recompute some values multiple 
times. The time to check the proof is essentially negligible: less 
than a millisecond across all data sizes. Hence, we do not consider 
this a significant cost. 

Figure 2(b) shows a clear separation between the two methods in P's effort in generating the proof. Here, we measure total time across all rounds in the multi-round case, and the time to generate the single round proof. The cost in the multi-round case is dramatically lower: it takes minutes to process inputs with $u = 2^{22}$ in the single round case, whereas the same data requires just 0.2 seconds when using the multi-round approach. Worse, the single-round cost grows with $u^{3/2}$, as seen with the steeper line: doubling the input size increases the cost by a factor of 2.8. In contrast, the multi-round cost grew linearly with $u$. Across all values of $u$, the multi-round prover processed 20-21 million updates per second. Meanwhile, at $u = 2^{20}$, the single-round P processed roughly 40,000 updates per second, while at $u = 2^{24}$, P processed only 10,000. Thus the chief bottleneck of these protocols seems to be P's time to make the proof.

The trend is similar for the space resources required to execute 
the protocol. In the single round case, both the verifier's space and 
size of the proof grow proportional to $\sqrt{u}$. This is not impossibly
large: Figure 2(c) shows that for u of the order of 1 billion, both 
these quantities are comfortably under a megabyte. Nevertheless, 
it is still orders of magnitude larger than the sizes seen in the multi- 
round protocol: there, the space required and proof size are never 
more than 1KB even when handling gigabytes of data. 

The results for reporting (SUB-VECTOR) queries are quite sim- 
ilar (Figure 3). Here, there are no comparable protocols for this 
query. The verifier's time is about the same as for the F2 query: 
unsurprising, as in both protocols V evaluates the LDE of the input 
at a point r. The prover's time is similarly fast, since the amount of 
work it has to do is about the same as the verifier (it has to compute 
hash values of various substrings of the input). The space cost of 
the verifier is minimal, primarily just to store r and some interme- 
diate values. The communication cost is dominated by the cost of 
reporting the answer (1000 values): the rest is less than 1KB. 

Our experiments focus on the case $u = n$. We can extrapolate the prover's cost, which scales as $O(\min(u, n \log(u/n)))$, to larger examples. Consider 1TB of IPv6 web addresses; this is approximately $6 \times 10^{10}$ IPv6 addresses, each drawn over a $\log u = 128$ bit domain. Figure 2(b) shows that processing $10^{10}$ updates from a domain of size $10^{10}$ takes approximately 500 seconds. In our IPv6 example, the input has 6 times more values, and the value of $\log u$ is approximately 4 times larger, so extrapolating we would expect our (uniprocessor) prover to take about 24 times longer to process this input, i.e. about 12,000 seconds (200 minutes). Note that this is comparable to the time to read this much data resident on disk [13].

In summary, the methods we have developed are applicable to 
genuinely large data sets, defined over a domain of size hundreds 
of millions to billions. Our implementation is capable of processing 
such datasets within a matter of seconds or minutes. 

6. EXTENSIONS 

We next consider how to treat other functions in the streaming interactive proof setting. We first consider some functions which are of interest in streaming, such as heavy hitters and $k$-largest. We then discuss extensions of the framework to handle a more general class of "frequency-based functions".

6.1 Other Specific Functions 

Heavy Hitters. The heavy hitters (HHs) are those items whose frequencies exceed a fraction $\phi$ of the total stream length $n$. In verifying the claimed set of HHs, V must ensure that all claimed HHs indeed have high enough frequency, and moreover no HHs are omitted. To convince V of this, P will combine a succinct witness set with a generalization of the SUB-VECTOR protocol to give a $(\frac{1}{\phi} \log u, \frac{1}{\phi} \log u)$-protocol for verifying the heavy hitters and their frequencies. As in our SUB-VECTOR protocol, V conceptually builds a binary tree $T$ with leaves corresponding to entries of $a$, and a random hash function associated with each level of $T$. We augment each internal node $v$ with a third child $c_v$, a leaf whose value is the sum of the frequencies of all descendants of $v$, the subtree count of $v$. The hash function now takes three arguments as input. It follows that V can still compute the hash $t$ of the root of this tree in logarithmic space, and $O(\log u)$ time per update.

In the $l$-th round, the prover lists all leaves at level $l$ whose subtree count is at least $\phi n$, their siblings, as well as their hash values and their subtree counts (so the hash of their parent can be computed). In addition, P provides all leaves whose subtree count is less than $\phi n$ but whose parent has subtree count at least $\phi n$; these nodes serve as witnesses to the fact that none of their descendants are heavy hitters, enabling V to ensure that no heavy hitters are omitted. This procedure is repeated for each level of $T$; note that for each node $v$ whose value P provides, all ancestors of $v$ and their siblings (i.e. all nodes on $v$'s "authentication path") are also provided, because the subtree count of any ancestor is at least as high as the subtree count of $v$. Therefore, V can compare the hash of the root (calculated while observing the stream) to the value provided by P, and the proof of soundness is analogous to that for the SUB-VECTOR protocol.

In total, there are at most $O(\frac{1}{\phi} \log u)$ nodes provided by P: for each level $l$, the sum of the subtree counts of nodes at level $l$ is $n$, and therefore there are $O(1/\phi)$ nodes at each level which have subtree count exceeding $\phi n$ or whose parent has subtree count exceeding $\phi n$. Hence, the size of the proof is at most $O(\frac{1}{\phi} \log u)$, and the time costs are as for the SUB-VECTOR protocol.
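
The counting argument can be illustrated on the vector of Figure 1. The Python sketch below is a simplification that tracks only subtree counts and omits the hashes and the extra child $c_v$; it relies on the observation that every node the prover lists (a heavy node, its sibling, or a light witness) is a child of a parent whose count is at least $\phi n$. The names `subtree_counts`, `listed_nodes` and `heavy_hitters` are hypothetical, positions are 0-based, and the vector length is assumed to be a power of two.

```python
from fractions import Fraction

def subtree_counts(a):
    """Counts per level, leaves first; each parent is the sum of its two children."""
    levels = [list(a)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[k] + prev[k + 1] for k in range(0, len(prev), 2)])
    return levels

def listed_nodes(a, phi):
    """Indices the prover lists at each level: children of parents with count >= phi * n."""
    levels = subtree_counts(a)
    thr = phi * sum(a)
    return [[i for i in range(len(levels[l])) if levels[l + 1][i // 2] >= thr]
            for l in range(len(levels) - 1)]

def heavy_hitters(a, phi):
    """Positions and frequencies of the items with frequency at least phi * n."""
    thr = phi * sum(a)
    return [(i, x) for i, x in enumerate(a) if x >= thr]

a = [2, 3, 8, 1, 7, 6, 4, 3]
phi = Fraction(1, 5)  # threshold phi * n = 34 / 5
listed = listed_nodes(a, phi)
```

At every level the number of listed nodes is at most $2/\phi$, because the counts at a level sum to $n$ and so at most $1/\phi$ parents can reach the threshold.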

The protocol cost can be improved to $(\log u, \frac{1}{\phi} \log u)$, i.e. we do not require V to store the heavy hitter nodes. This is accomplished by having the prover, at each level of $T$, "replay" the hash values of all nodes listed in the previous round. V can keep a simple fingerprint of the identities and hash values of all nodes listed in each round (computing their hash values internally), and compare this to a fingerprint of the hash values and identities listed by P. If these fingerprints match for each level, V is assured that the correct information was presented. Note each node is repeated just once, so this only doubles the communication cost. This reduced cost protocol is used in Section 6.2.

$k$-largest. Given the same set up as the PREDECESSOR query, the $k$-th largest problem is to find the largest $p$ in the stream such that there are at least $k - 1$ larger values $p'$ also present in the stream. This can be solved by the prover claiming that the $k$-th largest item occurs at location $j$, and performing the range query protocol with the range $(j, u)$, allowing V to check that there are exactly $k$ distinct items present in the range. This has a cost of $(\log u, k + \log u)$. For large values of $k$, alternative approaches via range sum (assuming all keys are distinct) can reduce the cost to $(\log u, \log u)$.

6.2 Frequency-based Functions 

Given the approach described in Section 3, it is natural to ask what other functions can be computed via sum-check protocols applied to carefully chosen polynomials. By extending the ideas from the protocol of Section 3, we get protocols for any statistic $F$ of the form $F(a) = \sum_{i \in [u]} h(a_i)$. Here, $h : \mathbb{N}_0 \to \mathbb{N}_0$ is a function over frequencies. Any statistic $F$ of this form is called a frequency-based function. Such functions occupy an important place in the streaming world. For example, setting $h(x) = x^2$ gives the self-join size. We will subsequently show that using functions of this form we can obtain non-trivial protocols for problems including:

• $F_0$, the number of distinct items in stream $A$.

• $F_{\max}$, the frequency of the most-frequent item in $A$.

• Point queries on the inverse distribution of $A$. That is, for any $i$, we will obtain protocols for determining the number of tokens with frequency exactly $i$.

The Protocol. A natural first attempt to extend the protocols of Section 3 to this more general case is to have V compute $f_a(r)$ as in Section 3, then have P send polynomials which are claimed to match sums over $h(f_a(x))$. In principle, this approach will work: for the $F_2$ protocol, this is essentially the outline with $h(x) = x^2$. However, recall that when this technique was generalized to $F_k$ for larger values of $k$, the cost increased with $k$. This is because the degree of the polynomial $h$ increased. In general, this approach yields a solution with cost $\deg(h) \log u$. This does not yet yield interesting results, since in general, the degree of $h$ can grow arbitrarily high, and the resulting protocol is worse than the trivial protocol which simply sends the entire vector $a$ at a cost of $O(\min(n, u))$.

To overcome this obstacle, we modify this approach to use a polynomial function $\tilde{h}$ with bounded degree that is sublinear in $n$ and $u$. At a high level, we "remove" any very heavy elements from the stream $A$ before running the protocol of Section 3.1, with $f_a$ replaced by $\tilde{h} \circ f_a$ for a suitably chosen polynomial $\tilde{h}$. By removing all heavy elements from the stream, we keep the degree of $\tilde{h}$ (relatively) low, thereby controlling the communication cost. We now make this intuition precise.

Assume $n = O(u)$ and let $\phi = u^{-1/2}$. The first step is to identify the set $H$ of $\phi$-heavy hitters (i.e. the set of elements with frequency at least $u^{1/2}$) and their frequencies. We accomplish this via the $(\log u, \frac{1}{\phi} \log u)$-protocol described in Section 6.1. V runs this protocol and, as the heavy hitters are reported, V incrementally computes $F' := \sum_{i \in H} h(a_i)$, which can be understood as the contribution of the heaviest elements to $F$, the statistic of interest.

In parallel with the heavy hitters protocol, V also runs the first part of the protocol of Section 3.1 with $d = \log u$. That is, V chooses a random location $r = (r_1, \ldots, r_d) \in [p]^d$ (where $p$ is a prime chosen larger than the maximum possible value of $F$), and while observing the stream V incrementally evaluates $f_a(r)$. As in Sections 2 and 3.1, this requires only $O(d)$ additional words of memory.

As the heavy hitters are reported, V "removes" their contribution to $f_a$ by subtracting $a_v \chi_v(r)$ from $f_a(r)$ for each $v \in H$. That is, let $\tilde{f}_a$ denote the polynomial implied by the derived stream obtained by removing all occurrences of all $\phi$-heavy hitters from $A$. Then V may compute $\tilde{f}_a(r)$ via the identity $\tilde{f}_a(r) = f_a(r) - \sum_{v \in H} a_v \chi_v(r)$. Crucially, V need not store the items in $H$ to compute this value; instead, V subtracts $a_v \chi_v(r)$ each time a heavy hitter $v$ is reported, and then immediately "forgets" the identity of $v$.

Now let $\tilde{h}$ be the unique polynomial of degree at most $u^{1/2}$ such that $\tilde{h}(i) = h(i)$ for $i = 0, \ldots, u^{1/2}$; V next computes $\tilde{h}(\tilde{f}_a(r))$ in small space. Note that this computation can be performed without explicitly storing $\tilde{h}$, since we can compute

$$\tilde{h}(x) = \sum_{i = 0, \ldots, u^{1/2}} h(i) \, \chi_i(x)$$

(assuming $h(\cdot)$ has a compact description as in the examples below).
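
The sum above is a Lagrange interpolation, and it can be evaluated one term at a time so that only a running total is kept. The sketch below assumes that $\chi_i$ denotes the Lagrange basis polynomial over the points $0, \ldots, D$ with $D = u^{1/2}$, that arithmetic is modulo the prime $2^{61} - 1$, and that $h$ is supplied as a function; the name `h_tilde` is hypothetical.

```python
P = 2**61 - 1  # assumed prime modulus

def h_tilde(h, D, x):
    """Evaluate at x the polynomial of degree <= D that agrees with h on 0..D (mod P)."""
    total = 0
    for i in range(D + 1):
        num, den = 1, 1
        for m in range(D + 1):
            if m != i:
                num = num * (x - m) % P
                den = den * (i - m) % P
        # chi_i(x) = num / den, with the division done via Fermat's little theorem
        total = (total + h(i) * num % P * pow(den, P - 2, P)) % P
    return total

h_f0 = lambda i: 1 if i > 0 else 0  # the h used for F_0
h_sq = lambda i: i * i              # the h used for F_2
```

This direct evaluation takes $O(D^2)$ field operations per point; it is meant to show the small-space computation, not an optimized one.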

The second part of the verification protocol can proceed in parallel with the first part. In the first round, the prover sends a polynomial $g_1(x_1)$ claimed to be

$$g_1(x_1) = \sum_{(x_2, \ldots, x_d) \in [\ell]^{d-1}} \tilde{h} \circ \tilde{f}_a(x_1, x_2, \ldots, x_d).$$

Observe that if $g_1$ is as claimed, then

$$F(a) = \sum_{x_1 \in [\ell]} g_1(x_1) + F' - |H| \, h(0).$$

Since the polynomial $g_1(x_1)$ has degree at most $u^{1/2}$, it can be described in $O(u^{1/2})$ words.

Then, as in Section 3.1, V sends $r_{j-1}$ to P in round $j > 1$. In return, the prover sends a polynomial $g_j(x_j)$, and claims

$$g_j(x_j) = \sum_{(x_{j+1}, \ldots, x_d) \in [\ell]^{d-j}} \tilde{h} \circ \tilde{f}_a(r_1, \ldots, r_{j-1}, x_j, x_{j+1}, \ldots, x_d).$$

The verifier conducts tests for correctness that are completely analogous to those in Section 3.1, which completes the description of the protocol. The proof of completeness and soundness of this protocol is analogous to those in Section 3.1 as well.

Analysis of space and communication. V requires $\log u$ words to run the heavy hitters protocol, and $O(d) = O(\log u)$ space to store $r_1, \ldots, r_d$, $f_a(r)$, $\tilde{f}_a(r)$, and to compute and store $\tilde{h}(\tilde{f}_a(r))$. The communication cost of the heavy hitters protocol is $u^{1/2} \log u$, while the communication cost of the rest of the protocol is bounded by the $d u^{1/2} = u^{1/2} \log u$ words used by P to send a polynomial of degree at most $u^{1/2}$ in each round. Thus, we have the following theorem:

THEOREM 6. Assume $n = \Theta(u)$. There is a $\log u$ round, $(\log u, u^{1/2} \log u)$-protocol for any statistic $F$ of the form $F(a) = \sum_{i \in [u]} h(a_i)$, with probability of failure $O(\frac{u^{1/2} \log u}{p})$. The verifier takes time $O(\log u)$ per update. The prover takes time $O(u^{3/2})$.

Using this approach yields protocols for the following problems: 

• $F_0$, the number of items with non-zero count. This follows by observing that $F_0$ is equivalent to computing $\sum_{i \in [u]} h(a_i)$ for $h(0) = 0$ and $h(i) = 1$ for $i > 0$.

• More generally, we can compute functions on the inverse distribution, i.e. queries of the form "how many items occur exactly $k$ times in the stream" by setting, for any fixed $k$, $h(k) = 1$ and $h(i) = 0$ for $i \ne k$. One can build on this to compute, e.g. the number of items which occurred between $k$ and $k'$ times, the median of this distribution, etc.

• We obtain a protocol for $F_{\max} = \max_i a_i$, with a little more work. P first claims a lower bound $lb$ on $F_{\max}$ by providing the index of an item with frequency $F_{\max}$, which V verifies by running the INDEX protocol from Section 4. Then V runs the above protocol with $h(i) = 0$ for $i \le lb$ and $h(i) = 1$ for $i > lb$; if $\sum_{i \in [u]} h(a_i) = 0$, then V is convinced no item has frequency higher than $lb$, and concludes that $F_{\max} = lb$.
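
The three choices of $h$ in the list above can be written out and applied to a small frequency vector. This sketch computes the statistics directly from the vector rather than through the protocol, and the names `freq_stat`, `h_f0`, `h_exactly` and `h_above` are hypothetical.

```python
def freq_stat(a, h):
    """Frequency-based function F(a) = sum_i h(a_i)."""
    return sum(h(x) for x in a)

def h_f0(x):
    """h for F_0: counts items with non-zero frequency."""
    return 1 if x > 0 else 0

def h_exactly(k):
    """h for the inverse distribution: counts items that occur exactly k times."""
    return lambda x: 1 if x == k else 0

def h_above(lb):
    """h for the F_max check: counts items whose frequency is strictly above lb."""
    return lambda x: 1 if x > lb else 0

a = [0, 3, 3, 0, 5]
lb = a[4]  # the prover points at index 4; its frequency would be verified via INDEX
```

A result of zero for `freq_stat(a, h_above(lb))` is what convinces the verifier that `lb` is the maximum frequency.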

COROLLARY 2. There is a $(\log u, u^{1/2} \log u)$-protocol that requires just $\log u$ rounds of interaction for $F_0$, $F_{\max}$, and queries on the inverse distribution.

Comparison. Compared to the previous protocols, the methods above increase the amount of communication between the two parties by a $u^{1/2}$ factor. The number of rounds of interaction remains $\log u$, equivalent to V's space requirement. So arguably these bounds are still good from the verifier's perspective. In contrast, the construction of [14] requires $\Omega(\log^2 u)$ rounds of interaction and communication, which may be large enough to be offputting. To make this concrete, for a terabyte-size input, $\log u$ rounds is of the order of 40, while $\log^2 u$ is of the order of thousands. Meanwhile, the $u^{1/2}$ communication is of the order of a megabyte. So although the total communication cost is higher, one can easily imagine scenarios where the latency of network communications makes it more desirable to have fewer rounds with more communication in each.

7. CONCLUDING REMARKS 

We have presented interactive proof protocols for various prob- 
lems that are known to be hard in the streaming model. By dele- 
gating the hard computation task to a possibly dishonest prover, the 
verifier's space complexity is reduced to $O(\log u)$. We now outline
directions for future study. 

Multiple Queries. Many of the problems considered are parame- 
terized by values that are only specified at query time. The results 
of these queries could cause the verifier to ask new queries with 
different parameters. However, re-running the protocols for a new 
query with the same choices of random numbers does not provide 
the same security guarantees. The guarantees rely on P not knowing these values; with this knowledge a dishonest prover could potentially find collisions under the polynomials, and fool the verifier.

Two simple solutions partially remedy this issue: firstly, it is safe 
to run multiple queries in parallel round-by-round using the same 
randomly chosen values, and obtain the same guarantees for each 
query. This can be thought of as a 'direct sum' result, and holds 
also for the Goldwasser et al. construction [14]. Secondly, V can just carry out multiple independent copies of the protocol. Since each copy requires only $O(\log u)$ space (more precisely $\log u + 1$ integers), the cost per query is low. Nevertheless, it remains of some practical interest to find protocols which can be used repeatedly to support a larger number of queries. Related work based on strong cryptographic assumptions has recently appeared [7, 12] but is currently impractical.

Distributed Computation. A motivation for studying this model 
arises from the case of cloud computation, which outsources com- 
putation to the more powerful "cloud". In practice, the cloud may 
in fact be a distributed cluster of machines, implementing a model 
such as Map-Reduce. We have so far assumed that the prover operates as a traditional centralized computational entity. The next step is to study how to create proofs over large data in the distributed model. A first observation is that the proof protocols we give here naturally lend themselves to this setting: observe that the prover's message in each round can be written as the inner product of the input data with a function defined by the values of $r_j$ revealed so far. Thus, these protocols easily parallelize, and fit into Map-Reduce settings very naturally; it remains to demonstrate this empirically, and to establish similar results for other protocols.

Other query types. From a complexity perspective, the main open 
problem is to more precisely characterize the class of problems that 
are solvable in this streaming interactive proof model. We have 
shown how to modify the construction of [14] to obtain $(\mathrm{polylog}\, u, \mathrm{polylog}\, u)$ streaming protocols for all of NC, and we showed that a wide class of reporting and aggregation queries possess $(\log u, \log u)$ protocols. It is of interest to establish what other natural queries possess $(\log u, \log u)$ protocols: $F_0$ and $F_{\max}$ are the prime candidates to resolve; other targets include other common queries, such as nearest neighbors. Determining whether problems outside NC possess interactive proofs (streaming or otherwise) with $\mathrm{polylog}\, u$ communication and a verifier that runs in nearly linear time is a
more challenging problem of considerable interest. This question 
asks, in essence, whether parallelizable computation is more easily 
verified than sequential computation. 

APPENDIX

A. RESULTS DUE TO PRIOR WORK 

Streaming Universal Arguments. A probabilistically checkable proof (PCP) is a proof in redundant form, such that the verifier need access only a few (randomly chosen) bits of the proof before deciding whether to accept or reject. A Universal Argument effectively simulates a PCP while ensuring P need not send the entire proof to V. We first describe this simulation, before describing a particular PCP system that, when simulated by a Universal Argument, can be executed by a streaming verifier.

For a language L on input a, a Universal Argument consists of
four messages. First, V sends P a collision-resistant hash function
h. Second, an honest P constructs a PCP $\pi$ for a, and then constructs
a Merkle tree of $\pi$ using h (the leaves of the tree are the bits of
$\pi$) [20]. P then sends the value of the root of the tree to V. This
effectively "commits" P to the proof $\pi$; P cannot subsequently alter
it without finding collisions for h. Third, V sends P a list of the
locations of $\pi$ that it needs to query. Finally, for each bit $b_i$
that is queried, P responds with the value of all nodes on $b_i$'s
authentication path in the Merkle tree (note this path has only
logarithmic length). V checks, for each bit $b_i$, that the
authentication path is correct relative to the value of the root; if
so, V is convinced P returned the correct value for $b_i$, as long as P
cannot find a collision for h. The theorem follows by combining this
construction with the fact that there exist PCP systems in which V
only needs access to a in order to evaluate O(1) locations in the
LDE $f_a$. We now justify this last claim by describing such a PCP
system.
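
The commit-and-open step described above can be sketched as follows. This is a minimal illustration, not the construction of [20]: it assumes SHA-256 as a stand-in for the hash function h chosen by V, assumes the number of proof bits is a power of two, and uses hypothetical function names.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Stand-in for the collision-resistant hash h (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()

def build_tree(bits):
    """Return all levels of the Merkle tree; levels[-1][0] is the root.
    Assumes len(bits) is a power of two."""
    level = [h(b"leaf" + bytes([b])) for b in bits]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def auth_path(levels, idx):
    """Sibling hashes from the leaf at position idx up to (not including) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[idx ^ 1])
        idx >>= 1
    return path

def verify(root, idx, bit, path):
    """Recompute the root from a claimed bit and its authentication path."""
    node = h(b"leaf" + bytes([bit]))
    for sibling in path:
        node = h(node + sibling) if idx % 2 == 0 else h(sibling + node)
        idx >>= 1
    return node == root
```

The prover would send `levels[-1][0]` as its commitment and later answer each query with the bit and `auth_path`, whose length is logarithmic in the number of proof bits.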

In [5], Ben-Sasson et al. describe for any language in NP a PCP
system in which V is not given explicit access to the input; instead,
V has oracle access to an encoding of the input a under an arbitrary
error-correcting code (to simplify a little). In their PCP system,
V runs in polylogarithmic time and queries only O(1) bits of the
encoded input, and O(1) bits of the proof $\pi$. Moreover, these bits
are determined non-adaptively (specifically, they do not depend on
a). We show this implies a PCP system that satisfies the claim for
any $L \in$ NP. Indeed, let LDE(a) denote the truth-table of $f_a$; i.e.,
LDE(a) is a list of elements in the field $Z_p$, one for each
$r \in Z_p^d$. There are (two-stage) concatenated codes whose first
stage applies the LDE operation to the input a (and whose second stage
applies a code to turn the field elements in LDE(a) into bits) that
suffice as encodings of a [2]. Therefore, a streaming verifier with
explicit access to the input a may simulate the verifier V in the PCP
system of Ben-Sasson et al.: each time V queries a bit $b_i$ of the
encoded input, there is a location r such that $b_i$ can be extracted
from $f_a(r)$.

A Universal Argument based on the PCP of the previous paragraph
has two additional properties worth mentioning. First, since
V need only query O(1) bits of $f_a$ and otherwise runs in poly log
time, we obtain a streaming verifier that runs in near-linear time.
Second, since V need only query O(1) bits of the proof, and the
authentication path of each bit in the Merkle tree is of length
O(log u), the communication cost of the Universal Argument is O(log u)
words. Putting all these pieces together yields Theorem 2.

Streaming "Interactive Proofs for Muggles."² In [14], P and V
first agree on a circuit C of fan-in 2 that computes the function of
interest; C is assumed to be in layered form. P begins by claiming a
value for the output gate of the circuit. The protocol then proceeds
iteratively from the output layer of C to the input layer, with one
iteration for each layer. Let $v^{(i)}$ be the vector of values that the
gates in the i-th layer of C take on input x, with layer 1
corresponding to the output layer, and let $f_{v^{(i)}}$ be the LDE of
$v^{(i)}$.

At a high level, in iteration 1, V reduces verifying the claimed
value of the output gate to verifying $f_{v^{(2)}}(r)$ for a random
location r. Likewise, in iteration i, V reduces verifying $f_{v^{(i)}}$
to verifying $f_{v^{(i+1)}}(r')$ for a random r'. Critically, the
verifier's final test requires only $f_{v^{(d)}}(r) = f_a(r)$, the
low-degree extension of the input at the random location r, which can
be chosen at random independent of the data or the circuit, and hence
computed by a streaming verifier. Note that each iteration takes
logarithmically many rounds, with a constant number of words of
communication in each round. Therefore the protocol requires
O(d log u) communication in total.
In particular, all problems that can be solved in log-space by
non-streaming algorithms (i.e., algorithms that can make multiple
passes over the input) possess polynomial size circuits of depth
$\log^2 u$, and hence there are (poly log u, poly log u) protocols for
these problems.

B. DETAILED PROOFS 
B.1 Analysis of self-join size

Proof of correctness. We now argue in detail that the verifier is 
unlikely to be fooled by a dishonest prover. 

LEMMA 1. If the prover follows the above protocol then the
verifier will accept with certainty. However, if the prover sends
any polynomial which does not meet the required property, then
the verifier will accept with probability at most $2d\ell/p$, where this
probability is over the random coin tosses of V.

PROOF. The first part is immediate from the following discussion:
if each $g_j$ is as claimed, then the verifier can easily ensure
that each $g_j$ is consistent with $g_{j-1}$.

For the second part, the proof proceeds from the d-th round back
to the first round. In the final round, the prover has sent $g_d$, of
degree $2\ell - 2$, and the verifier checks that it agrees with a
precomputed value at $x_d = r_d$. This is an instance of the
Schwartz-Zippel polynomial equality testing procedure [24]. If $g_d$ is
indeed as claimed, then the test will always be passed, no matter
what the value of $r_d$. But if $g_d$ does not satisfy the equality, then
$\Pr[g_d(r_d) = f_a^2(r)] \le \frac{2\ell - 2}{p}$. Therefore, if p was
chosen so that $p \gg \ell$, then the verifier is unlikely to be fooled.

² This result was observed by Guy Rothblum; here, we present the
details of the construction for completeness.

The argument proceeds inductively. Suppose that the verifier is
convinced (with some small probability of error) that $g_{j+1}(x_{j+1})$
is indeed as claimed, and wants to be sure that $g_j(x_j)$ is also as
claimed. The prover has claimed that

$$g_j(x_j) = \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} f_a^2(r_1,\ldots,r_{j-1},x_j,x_{j+1},\ldots,x_d).$$

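We again verify this by a Schwartz-Zippel polynomial test: we
evaluate $g_j(x_j)$ at a randomly chosen point $r_j$, and ensure that the
result is correct. Observe that

$$\sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} f_a^2(r_1,\ldots,r_j,x_{j+1},\ldots,x_d)
= \sum_{x_{j+1} \in [\ell]} \; \sum_{x_{j+2},\ldots,x_d \in [\ell]^{d-j-1}} f_a^2(r_1,\ldots,r_j,x_{j+1},x_{j+2},\ldots,x_d).$$

By the definition of the $g_j$, the left-hand side is the claimed
value of $g_j(r_j)$, and the inner sum on the right-hand side is the
claimed value of $g_{j+1}(x_{j+1})$.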

Therefore, if the verifier V believes that $g_{j+1}$ is as claimed, then
(provided the test passes) V has enough confidence to believe that
$g_j$ is also as claimed. More formally,

$$\Pr\Big[\, g_j \ne \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} f_a^2(r_1,\ldots,r_{j-1},x_j,\ldots,x_d)
\;\Big|\; g_{j+1} = \sum_{x_{j+2},\ldots,x_d \in [\ell]^{d-j-1}} f_a^2(r_1,\ldots,r_j,x_{j+1},\ldots,x_d) \Big] \le \frac{2\ell}{p}.$$
In the final step, the verifier is satisfied that $g_1$ is consistent
with $g_2$, and so $g_1$ is as claimed. The probability that $g_1$ is not as
claimed can be bounded as the probability that the verifier was fooled
in any intervening step. This is at most $2d\ell/p$, by a union bound.

Intuitively, the key reason for the prover's inability to fool the
verifier is that the prover must commit to a particular $g_j$ before
$r_j$ is revealed to him. So while the prover could then choose a
$g_{j+1}$ which causes the test on that pair to pass, $g_{j+1}$ is also
"dishonest". But ultimately, the prover must provide $g_d$, which V can
check based on information that is known to V alone. The prover
is very unlikely to have included a dishonest $g_j$ along the way and
passed all the subsequent tests to generate a $g_d$ which is consistent
with the final test using $r_d$ (which remains unknown to P). □
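
The round structure analysed in the proof can be sketched in a few lines for $\ell = 2$. This is an illustrative toy, not the implementation used in the experiments: the prover is a brute-force one, the modulus and the bit ordering of indices are assumptions, and all function names are hypothetical.

```python
import itertools
import random

P = 2**61 - 1  # illustrative prime modulus (an assumption)

def chi(bit, x):
    """chi_0(x) = 1 - x and chi_1(x) = x over Z_P."""
    return x % P if bit else (1 - x) % P

def f_a(a, point):
    """Evaluate the LDE of a (l = 2) at a point of Z_P^d.
    Assumption: bit k of index i plays the role of coordinate v_{k+1}."""
    total = 0
    for i, a_i in enumerate(a):
        w = a_i % P
        for k, x in enumerate(point):
            w = w * chi((i >> k) & 1, x) % P
        total = (total + w) % P
    return total

def honest_g(a, d, revealed, c):
    """g_j(c) by brute force, where j = len(revealed) + 1."""
    j = len(revealed) + 1
    total = 0
    for tail in itertools.product((0, 1), repeat=d - j):
        total = (total + f_a(a, list(revealed) + [c] + list(tail)) ** 2) % P
    return total

def eval_quadratic(evals, x):
    """Value at x of the degree-2 polynomial with the given values at 0, 1, 2."""
    e0, e1, e2 = evals
    inv2 = pow(2, P - 2, P)
    l0 = (x - 1) * (x - 2) * inv2
    l1 = -x * (x - 2)
    l2 = x * (x - 1) * inv2
    return (e0 * l0 + e1 * l1 + e2 * l2) % P

def verify_f2(a, d, claimed, prover=honest_g, rng=random):
    """Run all d rounds; return True iff the verifier accepts `claimed`."""
    r = [rng.randrange(P) for _ in range(d)]
    fa_r = f_a(a, r)  # a streaming verifier would maintain this during its pass
    expected = claimed % P
    for j in range(d):
        # The prover only ever sees r_1, ..., r_{j-1} when sending g_j.
        evals = [prover(a, d, r[:j], c) for c in (0, 1, 2)]
        if (evals[0] + evals[1]) % P != expected:
            return False
        expected = eval_quadratic(evals, r[j])
    return expected == fa_r * fa_r % P

def cheating_g(a, d, revealed, c):
    """Dishonest first message: inflate g_1(0) by one, then answer honestly."""
    bump = 1 if (not revealed and c == 0) else 0
    return (honest_g(a, d, revealed, c) + bump) % P
```

With `cheating_g`, the first-round sum matches an inflated claim, but the next round's consistency check fails unless $r_1$ happens to be a root of the difference polynomial, which mirrors the per-round bound in the lemma.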

Analysis of prover's costs. Besides the verifier's space and
communication, this protocol is also quite efficient in terms of the
other costs. Let us fix $\ell = 2$. As the stream is being processed the
verifier has to update the LDE $f_a(r)$. The updates are very simple,
since $\chi_0(x) = 1 - x$ and $\chi_1(x) = x$, so

$$\chi_v(r) = \prod_{j=1}^{d} \chi_{v_j}(r_j).$$

Thus processing each update in the stream takes O(d) = O(log u) time.
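
A minimal sketch of this per-update work, for $\ell = 2$, is given below. The class name, the modulus, and the convention that bit k of the index pairs with $r_{k+1}$ are assumptions made for illustration.

```python
import random

P = 2**61 - 1  # illustrative prime modulus (an assumption)

class StreamingLDE:
    """Maintains f_a(r) under updates a_i <- a_i + delta, storing only r and one value."""

    def __init__(self, d, rng=random):
        self.r = [rng.randrange(P) for _ in range(d)]  # chosen before the stream starts
        self.value = 0

    def chi(self, i):
        """chi_v(r) as a product of d factors, one per bit of i."""
        w = 1
        for k, rk in enumerate(self.r):
            w = w * (rk if (i >> k) & 1 else 1 - rk) % P
        return w

    def update(self, i, delta):
        """O(d) field operations per stream update."""
        self.value = (self.value + delta * self.chi(i)) % P
```

The state is d + 1 field elements, matching the logarithmic space bound, and the result does not depend on the order in which updates arrive.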

The prover has to retain the input vector a, which can be done
efficiently in space O(min(u, n)). In the verification process it is
clear that the verifier spends O(1) time per round evaluating a
degree-2 polynomial, so the total time is O(log u). On the prover side,
it might appear costly to compute each $g_j(x_j)$ naively following the
definition. But observe that $g_j(x_j)$ is a polynomial of degree 2, so
it is sufficient to evaluate $g_j(x_j)$ at three locations, say at
$x_j = 0, 1, 2$, to determine $g_j(x_j)$. For a location $x_j = c$, we
rewrite

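$$
\begin{aligned}
g_j(c) &= \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} f_a^2(r_1,\ldots,r_{j-1},c,x_{j+1},\ldots,x_d) \\
&= \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} \Big( \sum_{\mathbf{v} \in [\ell]^d} a_{\mathbf{v}}\, \chi_{\mathbf{v}}(r_1,\ldots,r_{j-1},c,x_{j+1},\ldots,x_d) \Big)^2 \\
&= \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} \; \sum_{\mathbf{v}_1,\mathbf{v}_2 \in [\ell]^d} a_{\mathbf{v}_1} a_{\mathbf{v}_2}\, \chi_{\mathbf{v}_1}(r_1,\ldots,r_{j-1},c,x_{j+1},\ldots,x_d) \cdot \chi_{\mathbf{v}_2}(r_1,\ldots,r_{j-1},c,x_{j+1},\ldots,x_d) \\
&= \sum_{\mathbf{v}_1,\mathbf{v}_2 \in [\ell]^d} \Big( a_{\mathbf{v}_1} a_{\mathbf{v}_2} \prod_{k=1}^{j-1} \chi_{v_{1,k}}(r_k) \cdot \chi_{v_{1,j}}(c) \cdot \prod_{k=1}^{j-1} \chi_{v_{2,k}}(r_k) \cdot \chi_{v_{2,j}}(c) \Big)
\cdot \sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} \Big( \prod_{k=j+1}^{d} \chi_{v_{1,k}}(x_k)\, \chi_{v_{2,k}}(x_k) \Big),
\end{aligned}
$$

where $v_{1,k}$ and $v_{2,k}$ denote the k-th coordinates of the index
vectors $\mathbf{v}_1$ and $\mathbf{v}_2$.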

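Note that $\chi_{v_k}(x_k) = 1$ iff $x_k = v_k$, and 0 for any other value
in $[\ell]$; for any pair $\mathbf{v}_1, \mathbf{v}_2$, we have

$$\sum_{x_{j+1},\ldots,x_d \in [\ell]^{d-j}} \Big( \prod_{k=j+1}^{d} \chi_{v_{1,k}}(x_k)\, \chi_{v_{2,k}}(x_k) \Big) = 1$$

if and only if $\forall\, j+1 \le k \le d: v_{1,k} = v_{2,k}$, and 0
otherwise. Thus,

$$
\begin{aligned}
g_j(c) &= \sum_{\substack{\mathbf{v}_1,\mathbf{v}_2 \in [\ell]^d \\ \forall j+1 \le k \le d:\; v_{1,k} = v_{2,k}}}
\Big( a_{\mathbf{v}_1} a_{\mathbf{v}_2} \prod_{k=1}^{j-1} \chi_{v_{1,k}}(r_k) \cdot \chi_{v_{1,j}}(c) \prod_{k=1}^{j-1} \chi_{v_{2,k}}(r_k)\, \chi_{v_{2,j}}(c) \Big) \\
&= \sum_{v_{j+1},\ldots,v_d \in [\ell]^{d-j}} \Big( \sum_{v_1,\ldots,v_j \in [\ell]^{j}} a_v\, \chi_{v_j}(c) \prod_{k=1}^{j-1} \chi_{v_k}(r_k) \Big)^2 .
\end{aligned}
$$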

P maintains $a_v \prod_{k=1}^{j-1} \chi_{v_k}(r_k)$ for each nonzero $a_v$,
updating with the new $r_k$ in each round as it is revealed in constant
time. Thus the total time spent by the prover for the verification
process can be bounded via O(n log u), where n is the number of
nonzero $a_v$'s.

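We make one further simplification. At the heart of the computation
is a summation over $[\ell]^j$ for each
$v_{j+1},\ldots,v_d \in [\ell]^{d-j}$. As we set $\ell = 2$,

$$\sum_{v_1,\ldots,v_j \in [\ell]^{j}} \Big( a_v\, \chi_{v_j}(c) \prod_{k=1}^{j-1} \chi_{v_k}(r_k) \Big)
= \sum_{v_j=0}^{1} \Big( \chi_{v_j}(c) \cdot \sum_{v_1,\ldots,v_{j-1} \in [\ell]^{j-1}} \Big( a_v \prod_{k=1}^{j-1} \chi_{v_k}(r_k) \Big) \Big).$$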

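And for each $v_j,\ldots,v_d \in [\ell]^{d-j+1}$, we can decompose

$$\sum_{v_1,\ldots,v_{j-1} \in [\ell]^{j-1}} \Big( a_v \prod_{k=1}^{j-1} \chi_{v_k}(r_k) \Big)
= \sum_{v_{j-1}=0}^{1} \Big( \chi_{v_{j-1}}(r_{j-1}) \sum_{v_1,\ldots,v_{j-2} \in [\ell]^{j-2}} \Big( a_v \prod_{k=1}^{j-2} \chi_{v_k}(r_k) \Big) \Big).$$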

By storing

$$A_j[v_j \ldots v_d] = \sum_{v_1,\ldots,v_{j-1} \in [\ell]^{j-1}} \Big( a_v \prod_{k=1}^{j-1} \chi_{v_k}(r_k) \Big),$$

P computes

$$A_{j+1}[v_{j+1} \ldots v_d] = \chi_0(r_j)\, A_j[0, v_{j+1} \ldots v_d] + \chi_1(r_j)\, A_j[1, v_{j+1} \ldots v_d]$$

in time $O(u/2^j)$. The total time is O(min(n log(u/n), u)), at most
linear in u. Note that computing the $F_2$ alone takes $\Theta(\min(n, u))$
time, so there is at most a logarithmic factor more work than simply
providing the answer.
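
The folding recurrence for $A_j$ can be sketched as below for $\ell = 2$. This dense version stores the full array and so illustrates only the O(u) bound, not the sparse O(n log(u/n)) variant; the modulus, the layout (lowest index bit is $v_j$), and the function name are assumptions.

```python
P = 2**61 - 1  # illustrative prime modulus (an assumption)

def prover_messages(a, d, rs):
    """Return [(g_j(0), g_j(1), g_j(2)) for j = 1..d].

    rs[j-1] is the challenge r_j; it is only used after the round-j message
    has been formed, mirroring the order of the interaction.
    """
    A = [x % P for x in a]  # A_1[v_1 ... v_d] = a_v, with v_1 the lowest bit
    assert len(A) == 2 ** d
    messages = []
    for j in range(d):
        half = len(A) // 2
        evals = []
        for c in (0, 1, 2):
            chi0, chi1 = (1 - c) % P, c % P
            # g_j(c) = sum over (v_{j+1}..v_d) of (chi_0(c) A[0,.] + chi_1(c) A[1,.])^2
            total = sum((chi0 * A[2 * t] + chi1 * A[2 * t + 1]) ** 2
                        for t in range(half))
            evals.append(total % P)
        messages.append(tuple(evals))
        r = rs[j]
        # A_{j+1}[.] = chi_0(r_j) A_j[0,.] + chi_1(r_j) A_j[1,.]
        A = [((1 - r) * A[2 * t] + r * A[2 * t + 1]) % P for t in range(half)]
    return messages
```

Each round halves the array, so the work per round shrinks geometrically and the total is linear in the array length.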
B.2 Analysis of sub-vector Protocol

Proof of Theorem 5. Correctness. It is clear that with an
honest P, V always accepts. Next, we argue that if P returns a
wrong value in any round, then $t' \ne t$ with high probability. P
first sends back $a'_i$ for all $q_L \le i \le q_R$ and their siblings (if
they are outside of the range). Consider any pair of siblings, say
$a'_i$ and $a'_{i+1}$. Consider the functions $f(x) = a_i + a_{i+1}x$ and
$f'(x) = a'_i + a'_{i+1}x$ in the field $Z_p$. If $a_i \ne a'_i$ or
$a_{i+1} \ne a'_{i+1}$, the two linear functions will not be identical,
and they will intersect at no more than one point in [p]. Since we
choose $r_1$ uniformly randomly from [p], the probability that
$f(r_1) = f'(r_1)$ is at most 1/p. Thus, if P's first message is not
correct, with probability at least 1 - 1/p, there will be at least one
error in the computed $\gamma'_1(i)$, $q_L \le i \le q_R$. The same argument
applies to each of the following (log u - 1) rounds: if either of the
siblings of $\gamma_j(q_L)$ and $\gamma_j(q_R)$ returned by P is wrong, or
some $\gamma'_j(i)$, $q_L \le i \le q_R$, is already wrong previously, then
with probability at most 1/p, the reconstructed $\gamma'_{j+1}(i)$ will be
all correct. By the union bound, the probability that an incorrect
response from P will lead to a correct $t'$ is at most $\frac{\log u}{p}$.
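
The single-round fact used above, that two distinct linear functions over $Z_p$ agree on at most one point, can be checked exhaustively for a small prime. The helper below is purely illustrative; its name and the choice p = 101 in the usage are assumptions.

```python
def collisions(a_i, a_next, b_i, b_next, p):
    """All r in [p] with a_i + a_next * r == b_i + b_next * r (mod p)."""
    return [r for r in range(p)
            if (a_i + a_next * r) % p == (b_i + b_next * r) % p]
```

For example, `collisions(3, 4, 10, 5, 101)` lists every challenge on which a prover that reports the sibling pair (10, 5) in place of (3, 4) would escape detection in that round.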

Analysis of costs. We first argue that V can compute t in small
space. Expanding t, we have

$$t = \sum_{i=1}^{u} a_i \prod_{j=1}^{\log u} r_j^{(i-1)_j}, \tag{8}$$

where $(i-1)_j$ denotes the j-th least significant bit of the binary
representation of i - 1. Initially when a = 0, we have t = 0; when
we have $a_i \leftarrow a_i + \delta$, t is incremented (modulo p) by
$\Delta t = \delta \cdot \prod_{j=1}^{\log u} r_j^{(i-1)_j}$, which is easily
computed in O(log u) time. Thus V can maintain t by just keeping
$t, r_1, \ldots, r_{\log u}$.
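
The incremental maintenance of t can be sketched as follows, together with a bottom-up evaluation of the tree for comparison. The sketch assumes the level-j hash of a node is $v_L + r_j v_R$ (the form implied by the linear functions in the correctness argument); the modulus and all names are illustrative assumptions.

```python
import random

P = 2**61 - 1  # illustrative prime modulus (an assumption)

class StreamingRoot:
    """Maintains the root hash t under updates a_i <- a_i + delta (i is 1-based)."""

    def __init__(self, log_u, rng=random):
        self.r = [rng.randrange(P) for _ in range(log_u)]
        self.t = 0

    def update(self, i, delta):
        w = delta % P
        for j, rj in enumerate(self.r):
            if ((i - 1) >> j) & 1:  # bit j set: a right child at level j + 1
                w = w * rj % P
        self.t = (self.t + w) % P

def root_from_tree(a, r):
    """Bottom-up hash of the full tree, as the prover would reconstruct it."""
    level = [x % P for x in a]
    for rj in r:
        level = [(level[2 * k] + rj * level[2 * k + 1]) % P
                 for k in range(len(level) // 2)]
    return level[0]
```

The verifier's state is log u + 1 field elements, while `root_from_tree` needs the whole vector; both arrive at the same root.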

The verifier's space requirement for the protocol is also bounded
by O(log u) words. Given the query range, as the sub-vector result
arrives at V, the verifier can keep track of only O(log u) hash values
of internal nodes, corresponding to at most one child of $\gamma_j(q_L)$
and $\gamma_j(q_R)$ for each j. Combining these with the hash values
provided by P will be sufficient to run the checking protocol. Each of
these can be maintained in small space in the same manner as the root t
via (8) above. Thus the space to carry out the protocol is O(log u).

The communication cost consists of the initial query result of
size k sent by the prover, plus O(1) nodes per level of the binary
tree T. So the total communication cost is O(log u + k).

Now we analyze the prover's cost. As the stream is received
the prover clearly needs linear space and O(1) time per element to
construct the vector a. At verification time the prover essentially
reconstructs the binary tree T. Note that T has at most n nonzero
leaves, so it has size O(min(u, n log(u/n))). Computing this tree
in a bottom-up fashion costs O(1) time per node, hence
O(min(u, n log(u/n))) time in total. □

Remarks. As in Section 3 the failure probability can be driven down
to $O(\frac{\log u}{u^c})$ for any constant c by picking p greater than
$u^c$, without changing the asymptotic bounds. From the description
above a dishonest prover may cause excessive communication by sending
more than k nonzero entries in the initial answer. To guarantee the
O(log u + k) bound with any P, we could first verify the value of k,
i.e., a RANGE-COUNT query, with O(log u) communication using
the protocol in Section 3. Then if P sends more than k nonzero
entries V will reject immediately.

We note that by modifying the hash function to $(1 - r_j)v_L + r_j v_R$,
it is possible to show that t is equivalent to the LDE $f_a(r)$, while
the same analysis holds. This provides a connection between the two
approaches, although the proofs are quite different in nature.
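
This equivalence is easy to check numerically for $\ell = 2$. The sketch below hashes the tree bottom-up with the modified function and compares the root against a direct evaluation of the multilinear extension; the modulus, the bit ordering, and the function names are assumptions made for illustration.

```python
import random

P = 2**61 - 1  # illustrative prime modulus (an assumption)

def root_modified(a, r):
    """Root of the tree when each level-j node is (1 - r_j) * v_L + r_j * v_R."""
    level = [x % P for x in a]
    for rj in r:
        level = [((1 - rj) * level[2 * k] + rj * level[2 * k + 1]) % P
                 for k in range(len(level) // 2)]
    return level[0]

def lde(a, r):
    """Direct evaluation of f_a(r) = sum_i a_i * prod_k chi_{bit k of i}(r_k)."""
    total = 0
    for i, a_i in enumerate(a):
        w = a_i % P
        for k, rk in enumerate(r):
            w = w * (rk if (i >> k) & 1 else 1 - rk) % P
        total = (total + w) % P
    return total
```

Each level of the tree fixes one more coordinate of the multilinear extension to $r_j$, which is why the two computations agree.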